Python Data Science Tutorial #9 - Plotting Histograms with Matplotlib

NeuralNine · Beginner · 🛠️ AI Tools & Apps · 6y ago
In today's episode we are going to plot professional histograms with Matplotlib in Python. Website: https://www.neuralnine.com/ Instagram: https://www.instagram.com/neuralnine Twitter: https://twitter.com/neuralnine GitHub: https://github.com/NeuralNine Programming Books: https://www.neuralnine.com/books/ Outro Music From: https://www.bensound.com/ Subscribe and Like for more free content!

What You'll Learn

This video tutorial covers plotting histograms with Matplotlib in Python, a key tool in data science for visualizing data distributions.

Full Transcript

What is going on, guys? Welcome to this Python tutorial series for data science. In today's video we're going to learn about histograms: how to plot them and what they're used for. Basically, a histogram is just a type of plot that visualizes a statistical distribution. So let us get into the code.

This time I have already imported Matplotlib and NumPy, as always. If you're not familiar with that, watch the previous videos, but we have just imported our two basic libraries. What we're going to visualize today is the heights of students. We're going to have a bunch of students that all have different heights, and these heights are distributed in a normal distribution. You may be familiar with the Gaussian, or I think it's called a bell curve. This is a principle of statistics: when you have a normal distribution, the values are distributed like a bell. This is the Gaussian bell curve, and what we're going to do in today's video is use a histogram to visualize that.

The first thing we need to do is define the values for the heights, and I'm going to use two parameters, mu and sigma, which are the average (or the mean) and the standard deviation. This is statistics; you don't need to understand it fully if you're not interested in statistics. It's more about plotting today. The light is getting shitty, but I guess it doesn't matter because you're not here to watch my face but to look at the code.

So we're going to define a mu and a sigma. As I said, mu is the average, the mean, and sigma is the standard deviation. We're going to say, okay, the average student is 172 centimeters tall, and yes, I'm going to use the metric system because I'm not American. The standard deviation sigma is, on average, how much the heights of the students differ or deviate from that mean value, and we can say the average deviation is 8 centimeters. So these are the two values that we're going to use.

Now we have to generate our student values. For that we're going to say x = mu + sigma * np.random.randn, and now we specify the number of students, or heights, that we want to generate. I'm going to say a thousand. What happens here is I take the average value, mu, of 172 centimeters, and then I add the standard deviation times random numbers, so sometimes a little bit more, sometimes a little bit less. What this randn function does is create a random normal distribution; the n here stands for normal distribution. So we're creating this bell curve that I talked about.

Now, to visualize that, we have to use a histogram, because that's the plot type that we use for this. We're going to say plt.hist, which is the function for a histogram, and we're going to pass the x value here. Now we're going to choose how many of these heights we want to visualize: do we want to visualize all thousand heights, or are we just going to pick a hundred? I'm going to pick a hundred here, so from that x we're going to visualize a hundred different data points. The next thing is we're going to define a facecolor; I'm going to pick blue. Basically that's it. We just show the plot now, and it should work out well.

As you can see, okay, that's not a perfect bell curve. Maybe we should use 10,000 different samples. Yeah, that looks much more like a bell curve. This is the typical Gaussian bell curve that I talked about. As you can see down here, we have the different heights, from 140 centimeters up to 2 meters, so 200 centimeters, and you can see how many students have each height.

Now, what we have on the left here, on the y-axis, are not the probabilities or the percentages; we have absolute values, which we don't want in that format. So we're going to say density=True, and that will convert this into percentages. Now, as you can see, five percent of students have these heights, four percent have these heights here, three percent have these heights here, and so on. Very few students have a height of 140 centimeters, which is kind of realistic because there are not a lot of people with that height, and also very few of them have a height above 190 centimeters, so that is kind of realistic.

What we're going to do now is add some labeling. Basically this is the histogram, so we're done with the plotting, but we're going to add some labeling here. We're going to say plt.xlabel is the heights, then plt.ylabel is percent of students, then plt.title is going to be heights of students, and last but not least we could turn on the grid if you want. So that's it, basically.

One more thing: I would like to have a text in this plot that tells me what mu and sigma are. I'm going to place it somewhere like here, which is 150 and 0.04, so I'm going to say plt.text, 150 and 0.04, and the text is mu equals 172, and I don't have a letter for sigma here, but let's say sig equals 8. This should work. Okay, a little bit more to the left maybe, so let's say 145. Should work out. Yes.

And that's it, we're done with the plot. Right now we can see the distribution of the different heights, we can see the mu and sigma values, and we have proper labeling. So that is how you visualize a statistical distribution with Matplotlib. That's it when it comes to histograms and the visualization of statistical distributions in Python. If you liked this video, if you enjoyed it, if you learned something, hit the like button for me, and also feel free to ask questions and give feedback in the comment section down below. Of course, subscribe to this channel if you want to see more, because we're getting deeper into data science and machine learning soon. So stay tuned, keep watching, and thank you very much for watching this video. See you in the next video. Bye.
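The code described in the transcript can be pieced together roughly as follows. This is a sketch reconstructed from the spoken description, not the author's actual file. The function names, the fixed seed, and the use of `np.random.default_rng` in place of `np.random.randn` are assumptions added for repeatability, and the exact label strings are approximate. Two details differ from how the video describes them: the second argument of `plt.hist` is the number of bins (bars), not the number of data points drawn, and the y-axis is labeled "Density" here because `density=True` produces a probability density rather than percentages.

```python
import numpy as np
import matplotlib.pyplot as plt

MU = 172    # mean height in centimeters (value from the video)
SIGMA = 8   # standard deviation in centimeters (value from the video)


def make_heights(n=10_000, seed=0):
    """Draw n normally distributed heights (the seed is an added assumption)."""
    rng = np.random.default_rng(seed)
    # Same idea as mu + sigma * np.random.randn(n) in the video.
    return MU + SIGMA * rng.standard_normal(n)


def plot_heights(x, bins=100):
    """Draw a density-normalized histogram with labels, a note, and a grid."""
    # bins is the number of bars; every value in x is still counted.
    counts, edges, _ = plt.hist(x, bins, density=True, facecolor="blue")
    plt.xlabel("Height (cm)")
    plt.ylabel("Density")
    plt.title("Heights of Students")
    # Position (145, 0.04) is the one the video settles on.
    plt.text(145, 0.04, f"mu = {MU}, sigma = {SIGMA}")
    plt.grid(True)
    return counts, edges


if __name__ == "__main__":
    heights = make_heights()
    plot_heights(heights)
    plt.show()
```

Passing `n=1000` to `make_heights` should reproduce the rougher-looking first attempt from the video, since 100 bins leave only about ten samples per bar on average.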
This video teaches how to plot professional histograms with Matplotlib in Python, which is essential for data science tasks. By watching this tutorial, viewers can learn how to effectively visualize and analyze data distributions. The tutorial is beginner-friendly and focuses on practical applications.

Key Takeaways
  1. Install Matplotlib library
  2. Import Matplotlib in Python
  3. Prepare data for histogram plotting
  4. Use Matplotlib functions to plot histograms
  5. Customize histogram appearance
  6. Interpret histogram results
💡 Matplotlib is a powerful library for creating high-quality 2D and 3D plots, including histograms, which are crucial for understanding data distributions in data science.
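On interpreting the result, one point is worth checking numerically: with `density=True` the bar heights are probability densities (per centimeter in the height example), so it is bar height times bin width that gives the share of students in a bin. The sketch below uses `np.histogram`, which computes the binning without drawing anything. The sample data, seed, and variable names are assumptions chosen to match the heights example from the video.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for repeatability
heights = 172 + 8 * rng.standard_normal(10_000)

# density=True scales the bars so that their total area is 1.
density, edges = np.histogram(heights, bins=100, density=True)
widths = np.diff(edges)  # bin widths in centimeters

# Fraction of students falling in each bin.
share_per_bin = density * widths
print(f"total area: {share_per_bin.sum():.3f}")

# Share of students within one standard deviation of the mean, counted directly.
within_one_sigma = np.mean(np.abs(heights - 172) <= 8)
print(f"within 172 +/- 8 cm: {within_one_sigma:.1%}")
```

If actual percentages per bar are wanted on the y-axis, one option is to drop `density=True` and pass `weights=np.full(len(heights), 100 / len(heights))` to `plt.hist`, so that each sample contributes its share of 100 percent.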
