Python for Data Science #5: Mastering Data Visualization with Matplotlib, Seaborn & Plotly

Analytics Vidhya · Beginner ·🔢 Mathematical Foundations ·6mo ago

Key Takeaways

Mastering data visualization with Matplotlib, Seaborn, and Plotly in Python

Full Transcript

So you all know that we are building our portfolio in the world of Python, especially for people who are new to the tech world and people who aspire to be data scientists or data analysts. Today we are going to discuss how we can use Python to deliver visualizations, stories, and insights. This is where the fun begins, where your data gets converted into beautiful art.

To do that, Python offers various packages. In our previous sessions and recordings you might have seen that we used Python packages like pandas, NumPy, stats, and math. Similarly, we have Matplotlib, Seaborn, and Plotly. These packages allow us to play with our data and convert boring data into beautiful charts that convey an amazing story. So today you and I are going to learn about these charts and visualizations in Python.

For that I'm using a package called Matplotlib, which is a wonderful package that can convert our data into charts, and NumPy, which deals with mathematical operations. I'm aliasing Matplotlib as plt and NumPy as np.

You must be wondering what these data points are going to do, right? So let me show you. I add one more cell here and run it. You see, NumPy's linspace has created numbers. How many of them? 100 of them, evenly spaced between 0 and 10. Then I'm passing those values to a mathematical function, the sine function, and saving the results in the x and y variables. So on the x-axis I have 100 numbers from 0 to 10, and on the y-axis I have the sine of each: sine of 0, sine of 0.1, sine of 0.2, all those values. Then using Matplotlib I'm defining a figure, right? You see this figure: it's six wide and four high.
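The setup described so far, together with the plotting calls the walkthrough turns to next, might look roughly like the sketch below. The notebook itself is not shown in the transcript, so the exact variable names and the title capitalization are assumptions; `np.linspace`, `np.sin`, and the `plt` calls are standard NumPy and Matplotlib functions.

```python
import matplotlib.pyplot as plt
import numpy as np

# 100 evenly spaced values from 0 to 10 inclusive (linspace is not random).
x = np.linspace(0, 10, 100)
y = np.sin(x)

# figsize is (width, height) in inches.
fig = plt.figure(figsize=(6, 4))
plt.plot(x, y, label="sin(x)", color="blue")
plt.title("Baseline Plot")   # title text as spoken; capitalization assumed
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()                 # builds the legend from the label above
plt.grid(True)               # turns on the grid lines
# plt.show()                 # uncomment when running as a script
```

In a Jupyter or Colab cell the figure renders at the end of the cell, so the `plt.show()` call is optional there.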
Then I'm saying: plot a chart with the x data on the x-axis and the y data on the y-axis. Label it sin x. You see this label, sin x, and the color will be blue. The title of the chart is baseline plot. The x label is X-axis, the y label is Y-axis, the legend is drawn here, and then I turn on the grid, as you can see from the grid lines, and call show. So when I run this cell, it generates a beautiful chart for me. Now imagine you have data like age versus salary, or engine size versus speed of the car. I can pass those values here and convert that into a beautiful visual.

That's not the end of it; we have more. Let's make complex charts. What if you have intensities, or multiple dimensions of data? For that we have an advanced package built on top of Matplotlib called Seaborn. Seaborn is imported as sns. Seaborn is a far more advanced package, and it has its own datasets as well. As you can see, I have imported a dataset. You remember in our last class we converted our data into data frames, right? So this is my data frame, a pandas data frame. You remember head and tail; we talked about them in the last recording. So what I'm doing is loading predefined data from the Seaborn package with load_dataset, a dataset called tips.

Then I'm plotting three charts: a distribution chart, a box plot, and a heat map. For each chart, I'm writing four lines of code. First I define the area that I want for the chart. It's a histogram with data coming from the tips data frame and a column named total bill, with kernel density estimate set to true and the color purple. The title of the chart is distribution of total bill. Let's look at this chart. It's a histogram, we all agree. The kernel density estimate is this line, a smooth curve that shows how the data is distributed. The color is purple, the title is distribution of total bill, and plt.show tells Python to render the chart. Second, again a new figure with a width and height. It's a box plot. What's a box plot?
It helps us identify everything from the minimum value to the maximum value, along with the outliers, and it gives me the range of my data. It helps me identify the median and the distribution of my data, whether it's right-skewed or left-skewed. We can identify a lot of things with it. Generally it is used for outlier detection in machine learning. So on the x-axis is the day and on the y-axis the total bill; the data frame is tips, and palette controls the colors and the formatting. The title is box plot of total bill by day. Look at this chart. On Thursday the bill starts roughly from $8 and goes up to $40 plus. On Saturday the sale comes as low as $2 and it goes above $50 also. So see how I can visually compare my sales. Obviously the sales are good during weekends, but the range is high. There are certain transactions which are very low and certain transactions which are extremely high. That's the beauty of visualization.

And the last chart is correlation. Correlation tells me how one column is related to another. Correlation and other statistical things you will learn in the stats classes. For now, consider it this way: when two things grow together, or move in opposite directions, correlation can capture that. As simple as that. If I keep on increasing the engine size, the mileage will come down: negative correlation. If I increase the engine size, the speed of the car increases: positive correlation. And then there is the case of no correlation. But those things you will learn in stats class. So I'm using the data frame and calling the correlation function on only the numerical columns, because correlation is for numerical data, and passing the result to a heat map. You see this thing, annotation true; it means show me the value in each cell. Correlation values are between minus 1 and 1, okay? And these are the color and the format of the text. So you see the heat map, how the numerical columns are related to each other: are they positively related, negatively related, or not correlated?
In our case, everything is positively related. So that's Matplotlib and Seaborn.

Then we can do some formatting. That's not the end of it. We can use some predefined styling, just like in web development: once a basic website is done, you improve it and add some fancy visuals. So here also I'm adding those things. I can plot line charts one on top of the other. You see two line charts, sine and cosine. You see the sine curve and the cosine curve. So I can plot two series on one chart. Sometimes you have to compare sales in Germany versus sales in France, right? I can do that in the same chart. I can even add an annotation on top of it. Look at the second chart. You see this? I'm adding an arrow. I can do that using annotate. In presentations you might have seen people pointing out: this is the month where sales went down, this is the month where sales went up. This is how, through Python, you can add those fancy visuals.

That's not enough. The real world has amazing data in the form of time series, like the stock market, gold prices, GDP, petroleum costs, crude oil prices, and whatnot. So I'm creating dates using pandas, starting from 1st January 2024 and running until 1st March, with a daily frequency: 1st January, 2nd January, 3rd January, until 1st March. Then I assign a random number to each day, taken as a cumulative sum: take the previous number, add a new random number, and keep on adding. Then I put these two into a dictionary. Key, value; key, value. This is a key, this is a value. I pass that dictionary into a data frame, a pandas data frame. Now I have created random data, converted it into a data frame, and plotted a chart. See: time series data with random values.
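Both steps above, the annotated two-line chart and the random time series, can be sketched in one short block. The annotation text and its position, the random generator, the seed, and the column names `date` and `value` are all assumptions chosen for illustration; the transcript only says that an arrow is added and that random numbers are accumulated day by day.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Part 1: two curves on one chart, plus an arrow annotation.
x = np.linspace(0, 10, 100)
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.annotate(
    "sin(x) peak",                      # hypothetical annotation text
    xy=(np.pi / 2, 1.0),                # point the arrow tip touches
    xytext=(3.0, 1.4),                  # where the text is placed
    arrowprops=dict(arrowstyle="->"),
)
ax.set_ylim(-1.5, 1.8)                  # leave room for the annotation text
ax.legend()

# Part 2: one value per day from 1 Jan 2024 to 1 Mar 2024.
rng = np.random.default_rng(42)         # seed is an assumption
dates = pd.date_range(start="2024-01-01", end="2024-03-01", freq="D")
steps = rng.standard_normal(len(dates))  # daily changes, can be negative
values = steps.cumsum()                  # running total of the daily changes
ts = pd.DataFrame({"date": dates, "value": values})

fig_ts, ax_ts = plt.subplots(figsize=(8, 4))
ax_ts.plot(ts["date"], ts["value"])
ax_ts.set_title("Time Series Data with Random Values")
ax_ts.set_xlabel("date")
ax_ts.set_ylabel("value")
```

Because each value is the previous value plus a new random step, the series drifts up and down like a random walk rather than jumping around independently from day to day.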
Cumulative sum means that whatever the value was, I now add to it. The added number could be negative also, since I'm generating a random number here, and eventually a time series has been generated. I can do this through Matplotlib and I can do this using Seaborn also. See, as of now you have the capability of converting your boring data into amazing visuals using packages like Matplotlib or Seaborn. This is what we use Python for, right? To make our life easy.

Now you must be thinking that all these charts are dead, right? I mean, they are two-dimensional, non-interactive charts. They are beautified, they tell a story, and that serves the purpose. But can I get interaction? What if I hover my mouse over a point and the values come alive? Don't worry, we have a solution for that too. There is a package called Plotly which makes interactive charts. So I'm importing plotly.express as px and plotly.io as pio. I'm setting the renderer for this notebook to make sure it can display interactive charts, and then building a scatter plot: data frame name, x-axis, y-axis, color, size, and title, everything passed as parameters to one scatter function. Then I call show. Look at this chart. If I hover over a point, it tells me it's Sunday, the bill was $20.69, the tip was five, and the size was five. Isn't it amazing? So you can make an interactive chart like this.

Similarly for the time series. Look here: the previous time series was dead, right? I had to eyeball it. What is this value? I don't know. But look at this chart. A simple Plotly line chart: data frame name, x-axis, y-axis, title of the chart, and it makes an interactive chart. Isn't it fun? Just imagine you convert your data into such visuals, go to a presentation, use them in slides, and tell your audience where the business is doing well and where it is doing badly. How amazing that would be, all through coding. And that was the agenda of today's session, right?
So I hope you enjoyed this, the world of Matplotlib, Seaborn, and Plotly. Go out there and play with these notebooks. You will get access to this data. You can run the notebooks on Google Colab, Jupyter Notebook, or Visual Studio with Python Interactive enabled. And if you have any questions or queries, reach out to us. We'll be happy to teach you, interact with you, grow with you, and collaborate with you. Until then, I'm Jent Mahara, and thank you very much.

Original Description

Transform your boring data into beautiful, insightful visualizations! In this tutorial, we dive into the world of data visualization with Python, a crucial skill for any aspiring data scientist or analyst. We'll explore three powerful libraries, Matplotlib, Seaborn, and Plotly, to create stunning static and interactive charts that tell a compelling story.

Starting with the fundamentals of Matplotlib for basic plotting, we'll then move to Seaborn for more complex statistical charts like histograms, box plots, and heatmaps. Finally, we'll unlock the power of interactive visualizations with Plotly, allowing you to create dynamic charts where you can hover to see data points and explore your insights in real-time. This video is perfect for beginners looking to build their Python portfolio and learn how to effectively communicate findings through data.

In this video, you will learn:
- Matplotlib Fundamentals: Create and customize basic plots like line charts.
- Advanced Plotting with Seaborn: Generate sophisticated statistical visuals including histograms, box plots for outlier analysis, and correlation heatmaps.
- Interactive Charts with Plotly: Build interactive scatter plots and time-series charts that bring your data to life.
- Data Preparation: Use NumPy and Pandas to generate and structure data for plotting.
- Customization Techniques: Learn how to add titles, labels, legends, annotations, and custom styling to your charts.
- Time Series Visualization: Plot time-dependent data to analyze trends over time.

Timestamps
0:00 - Introduction: Turning Data into Art with Python
1:14 - Getting Started with Matplotlib & NumPy
1:35 - Creating Your First Line Plot (Sine Wave)
3:27 - Advanced Visualization with Seaborn
4:00 - Plotting a Histogram (Distribution Plot)
4:57 - Creating a Box Plot to Analyze Data Distribution
6:11 - Understanding Correlation with a Heatmap
7:30 - Customizing Plots: Multiple Lines & Annotations
8:36 - Visualizing Time Series Data
10:40 - Creating In
