Python for Data Science #5: Mastering Data Visualization with Matplotlib, Seaborn & Plotly
Key Takeaways
Mastering data visualization with Matplotlib, Seaborn, and Plotly in Python
Full Transcript
So you all know that we are building our portfolio in the world of Python, especially for people who are new to the tech world and people who aspire to be data scientists, data analysts, and the like. Today we are going to discuss how we can use Python to deliver visualizations, stories, and insights. This is where the fun begins, where your data gets converted into beautiful art.
To do that, Python offers various packages. In our previous sessions or recordings you might have seen that we used Python packages like pandas, NumPy, stats, and math. Similarly, we have Matplotlib, Seaborn, and Plotly. These packages let us play with our data and convert boring data into beautiful charts that convey an amazing story. So today you and I are going to learn about charts and visualization in Python.
For that I'm using a package called Matplotlib, which is a wonderful package that can convert our data into charts, and NumPy, which deals with mathematical operations. I'm importing Matplotlib's pyplot module under the alias plt and NumPy under the alias np.
You must be wondering what these data points are going to do, right? So let me add one more cell here and run it. You see, NumPy's linspace has created numbers. How many of them? 100 of them, evenly spaced between 0 and 10. Then I'm passing those values to a mathematical function, the sine function, and saving the results in the x and y variables. So on the x-axis I have 100 numbers from 0 to 10, and on the y-axis I have the sine of each one: sine of 0, sine of roughly 0.1, sine of roughly 0.2, and so on. Then, using Matplotlib, I'm defining a figure. You see this figure? It's six wide and four high.
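As a rough sketch of the sine-wave example, the following reproduces the data generation described above, followed by the plotting calls that the walkthrough covers next. The function names `make_sine_data` and `plot_baseline` are hypothetical helpers introduced only for illustration; the title and axis labels follow what is said in the video, and the code in the actual notebook may differ.

```python
import matplotlib.pyplot as plt
import numpy as np


def make_sine_data(n_points=100):
    """Return n_points evenly spaced x values on [0, 10] and their sines."""
    x = np.linspace(0, 10, n_points)  # evenly spaced, not random
    y = np.sin(x)
    return x, y


def plot_baseline(x, y):
    """Draw the labelled line chart and return the figure (hypothetical helper)."""
    fig = plt.figure(figsize=(6, 4))  # width 6 in, height 4 in
    plt.plot(x, y, label="sin(x)", color="blue")
    plt.title("Baseline Plot")
    plt.xlabel("X-axis")
    plt.ylabel("Y-axis")
    plt.legend()    # shows the "sin(x)" label
    plt.grid(True)  # adds grid lines behind the curve
    return fig


x, y = make_sine_data()
fig = plot_baseline(x, y)
# In a notebook the figure renders on its own; in a script, call plt.show().
```

Swapping `x` and `y` for two real columns, such as age and salary, gives the same chart for your own data.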
Then I'm saying: plot a chart with the x data on the x-axis and the y data on the y-axis. Label it sin(x). You see this label, sin(x), and the color will be blue. The title of the chart is Baseline Plot. The x label is X-axis, the y label is Y-axis, the legend is drawn here, and then a grid is added, as you can see from the grid lines. Finally, show. So when I run this cell, it generates a beautiful chart for me. Now imagine you have data like age versus salary, or engine size versus the speed of a car. I can pass those values here and convert them into a beautiful visual.
That's not the end of it; we have more. Let's make complex charts. What if you have intensities, or multiple dimensions of data? For that we have an advanced package built on top of Matplotlib called Seaborn. Seaborn is imported as sns. It is a higher-level package and it comes with its own datasets as well. As you can see, I have imported a dataset. You remember that in our last class we converted our data into DataFrames, right? So this is my DataFrame, a pandas DataFrame. You remember head and tail; we talked about them in the last recording. What I'm doing is loading predefined data from the Seaborn package: load_dataset with the dataset called tips.
Then I'm plotting three charts: a distribution chart, a box plot, and a heat map. For each chart, I'm writing four lines of code. First I define the area that I want for the chart. Then it's a histogram with data coming from the tips DataFrame and the column named total_bill, with kernel density estimate set to true and the color purple. The title of the chart is Distribution of Total Bill. Let's look at this chart. It's a histogram, we all agree. The kernel density estimate is this line, a smooth curve that estimates the shape of the distribution, so you can see how the data flows. The color is purple, the title is Distribution of Total Bill, and plt.show tells Python to render the chart. The second chart is again a new figure with a width and a height. It's a box plot. What's a box plot?
It helps us identify the minimum and maximum values, along with any outliers, and it gives me the range of my data. It helps me identify the median and the distribution of my data, whether it's right skewed or left skewed. We can identify a lot of things with it. Generally it is used for outlier detection in machine learning. So on the x-axis is the day, on the y-axis is the total bill, the DataFrame is tips, and the palette is the color scheme. The title is Box Plot of Total Bill by Day. Look at this chart. On Thursday the bill starts at roughly $8 and goes up to $40 plus. On Saturday the bill comes as low as about $3 and it also goes above $50. So see how I can compare my sales visually. Obviously the sales are good during weekends, but the range is wide. There are certain transactions which are very low and certain transactions which are extremely high. That's the beauty of visualization.
And the last chart is correlation. Correlation tells me how one column is related to another. Correlation and other statistical ideas you will learn in the stats classes. For now, consider it this way: when two things grow together, or move in opposite directions, correlation captures that. As simple as that. If I keep increasing the engine size, the mileage comes down: negative correlation. If I increase the engine size, the speed of the car increases: positive correlation. And then there is the case of no correlation. But those things you will learn in stats class.
So I'm taking the DataFrame and calling the correlation function on the numerical columns only, because correlation is for numerical data, and passing the result to a heat map. You see this thing, annot=True? It means show me the correlation value in each cell. Correlation values lie between -1 and 1. And these are the color map and the format of the text. So you see in the heat map how the numerical columns are related to each other: whether they are positively related, negatively related, or not correlated.
In our case, everything is positively related. So that's Matplotlib and Seaborn. Then we can do some formatting. That's not the end of it. We can use some predefined styling, just like in web development: once a basic website is done, you improve it, you add some fancy visuals and all of that. So here also I'm adding those things. I can plot line charts one on top of the other. You see two line charts, sine and cosine. You see the sine curve and the cosine curve. So I can plot two series on one chart. Sometimes you have to compare sales in Germany versus sales in France, right? I can do that in the same chart as well. I can even add an annotation on top of it. Look at the second chart here. You see this? I'm adding an arrow. That I can also do, using the annotate keyword. In presentations you might have seen people pointing things out: this is the month where sales went down, this is the month where sales went up. This is how you can add those fancy visuals through Python.
That's not enough. The real world has amazing data in the form of time series, like the stock market, gold prices, GDP, crude oil prices, and whatnot. So I'm creating a range of dates using pandas, which will start on 1 January 2024 and run until 1 March, and the frequency will be daily: 1 January, 2 January, 3 January, and so on until 1 March. Then I'll assign a random number to each day and take the cumulative sum, so each value is the previous value plus a new random number. Then I convert these two into a dictionary. Key, value; key, value. This is a key, this is a value. I pass that dictionary into a DataFrame, a pandas DataFrame. So now I have created random data, converted it into a DataFrame, and then I'm plotting a chart. See: time series data with random values.
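The two ideas in this part, overlaying lines with an annotation and building a random time series, might look roughly like this. It is only a sketch under stated assumptions: the helper names, the `ggplot` style, the annotation text and position, the column names `date` and `value`, the seed, and the use of standard-normal steps are all choices made for illustration, since the walkthrough does not state them.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_sine_cosine():
    """Overlay sine and cosine on one chart and point an arrow at a peak."""
    x = np.linspace(0, 10, 100)
    with plt.style.context("ggplot"):  # assumed predefined style
        fig, ax = plt.subplots(figsize=(6, 4))
        ax.plot(x, np.sin(x), label="sin(x)")
        ax.plot(x, np.cos(x), label="cos(x)")
        # Arrow from the text to the first sine peak at (pi/2, 1)
        ax.annotate(
            "peak",
            xy=(np.pi / 2, 1.0),
            xytext=(3.0, 1.3),
            arrowprops={"arrowstyle": "->"},
        )
        ax.set_ylim(-1.5, 1.6)  # leave room for the annotation text
        ax.legend()
    return ax


def make_random_walk(seed=0):
    """Daily dates from 2024-01-01 to 2024-03-01 with cumulative random values."""
    dates = pd.date_range(start="2024-01-01", end="2024-03-01", freq="D")
    rng = np.random.default_rng(seed)
    steps = rng.standard_normal(len(dates))  # each step can be negative
    # Dictionary of column name -> values, passed to the DataFrame constructor
    return pd.DataFrame({"date": dates, "value": steps.cumsum()})


def plot_time_series(df):
    """Line chart of the random walk over time."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(df["date"], df["value"])
    ax.set_title("Time Series Data with Random Values")
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    return ax


ax_lines = plot_sine_cosine()
walk = make_random_walk()
ax_series = plot_time_series(walk)
```

Because 2024 is a leap year, the date range covers 61 days, so `walk` has 61 rows.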
Cumulative sum means that whatever value was there, I now add to it. The step could be negative as well, since I'm generating a random number here, and eventually time series data has been generated. I can do this through Matplotlib and I can do it using Seaborn as well. See, so as of now you have the capability of converting your boring data into amazing visuals using packages like Matplotlib or Seaborn. This is what we use Python for, right? To make our life easy.
Now you must be thinking that all these charts are dead, right? I mean, they are two-dimensional, non-interactive charts. They are beautified, they tell a story, and that serves the purpose. But can I get interaction? What if I hover my mouse over a point and the values come alive? Don't worry, we have a solution for that too. There is a package called Plotly which makes interactive charts. So I'm importing plotly.express as px and plotly.io as pio. I'm setting the renderer for this notebook to make sure it is able to interact, and then building a scatter plot: DataFrame name, x-axis, y-axis, color, size, and title, everything passed as parameters to one scatter function. Then I call show. Look at this chart. See, if I hover over a point, it tells me it's Sunday, the bill was $20.69, the tip was five, and the size was five. Isn't it amazing? So you can make an interactive chart like this.
Similarly for the time series. Look here: the previous time series chart was dead, right? I had to eyeball it. What is this value? I don't know. But look at this chart. A Plotly line chart: DataFrame name, x-axis, y-axis, title of the chart, and it makes an interactive chart. Isn't it fun? Just imagine you convert your data into such visuals, go to a presentation, use them in slides, and tell people whether the business is doing well or badly. How amazing that would be through coding. And that was the agenda of today's session.
So I hope you enjoyed this, the world of Matplotlib, Seaborn, and Plotly. Go out there and play with these notebooks. You will get access to this data. You can run the notebooks on Google Colab, in Jupyter notebooks, or in Visual Studio with Python Interactive enabled. And if you have any questions or queries, reach out to us. We'll be happy to teach you, interact with you, grow with you, and collaborate with you. Until then, I'm Jent Mahara, and thank you very much.
Original Description
Transform your boring data into beautiful, insightful visualizations! In this tutorial, we dive into the world of data visualization with Python, a crucial skill for any aspiring data scientist or analyst. We'll explore three powerful libraries—Matplotlib, Seaborn, and Plotly—to create stunning static and interactive charts that tell a compelling story.
Starting with the fundamentals of Matplotlib for basic plotting, we'll then move to Seaborn for more complex statistical charts like histograms, box plots, and heatmaps. Finally, we'll unlock the power of interactive visualizations with Plotly, allowing you to create dynamic charts where you can hover to see data points and explore your insights in real-time. This video is perfect for beginners looking to build their Python portfolio and learn how to effectively communicate findings through data.
In this video, you will learn:
- Matplotlib Fundamentals: Create and customize basic plots like line charts.
- Advanced Plotting with Seaborn: Generate sophisticated statistical visuals including histograms, box plots for outlier analysis, and correlation heatmaps.
- Interactive Charts with Plotly: Build interactive scatter plots and time-series charts that bring your data to life.
- Data Preparation: Use NumPy and Pandas to generate and structure data for plotting.
- Customization Techniques: Learn how to add titles, labels, legends, annotations, and custom styling to your charts.
- Time Series Visualization: Plot time-dependent data to analyze trends over time.
Timestamps
0:00 - Introduction: Turning Data into Art with Python
1:14 - Getting Started with Matplotlib & NumPy
1:35 - Creating Your First Line Plot (Sine Wave)
3:27 - Advanced Visualization with Seaborn
4:00 - Plotting a Histogram (Distribution Plot)
4:57 - Creating a Box Plot to Analyze Data Distribution
6:11 - Understanding Correlation with a Heatmap
7:30 - Customizing Plots: Multiple Lines & Annotations
8:36 - Visualizing Time Series Data
10:40 - Creating In