Pandas Profiling for Data Science (Quick and Easy Exploratory Data Analysis)
Key Takeaways
This video demonstrates how to use the pandas-profiling library in Python for quick and easy exploratory data analysis, allowing users to get a glimpse of their data with minimal effort. The library provides a range of statistics and visualizations, including descriptive statistics, histograms, correlation plots, and heat maps.
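As a rough orientation, a few of the numbers listed above can be approximated with plain pandas. This is only a sketch and not how pandas-profiling builds its report internally; the random data, the seed, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): 100 rows, 5 numeric columns named a..e
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((100, 5)), columns=list("abcde"))

# Dataset-level overview, similar in spirit to the top of the report
overview = {
    "variables": df.shape[1],
    "rows": df.shape[0],
    "missing_cells": int(df.isna().sum().sum()),
    "duplicate_rows": int(df.duplicated().sum()),
}

# Per-variable descriptive statistics (count, mean, std, min, quartiles, max)
stats = df.describe()

# Histogram counts for one variable, using 10 equal-width bins
counts, bin_edges = np.histogram(df["a"], bins=10)

# Correlation matrices; Kendall and Phik are left out here because they
# need extra packages beyond numpy and pandas
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

print(overview)
print(stats.loc[["mean", "min", "max"]])
print(pearson.round(2))
```

The profiling report adds the plots and the interactive HTML layout on top of figures like these.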
Full Transcript
Welcome back to the Data Professor YouTube channel. If you're new here, my name is Chanin Nantasenamat and I'm an associate professor of bioinformatics. On this YouTube channel we cover data science concepts and practical tutorials, so if you're into this kind of content, please consider subscribing. In this video I'm going to give a short tutorial on how you can use pandas profiling to do exploratory data analysis. So without further ado, let's get started.

The first thing you want to do is head over to Google and search for "pandas profiling". Click on the first link, which goes to the GitHub repository at github.com/pandas-profiling/pandas-profiling. Scroll down and find the command that will allow you to install the software. I'm going to use pip install. I'm on Windows, so I'm going to head over to the command prompt, activate the environment, and then install it using pip install pandas-profiling[notebook,html]. [Music] This should take some time.

Okay, so it's installed. I'm going to open up my Jupyter notebook and create a new notebook, and I'll just show you using the example code here. What this essentially does is import numpy, import pandas, and then import pandas profiling; the function that we're going to use is ProfileReport. Then we create a data frame, using numpy to generate random numbers in 100 rows and five columns, and the five columns are named a, b, c, d, and e. Then we create a variable and assign it the result of the ProfileReport function. The input argument is the data frame, the title of the generated report will be "Pandas Profiling Report", and in the HTML settings full_width is set to True, which means the HTML output will occupy the full width of the web page. So let's enter that, and it's doing it automatically.
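The example code described in the narration can be sketched roughly as follows. The column names, the title, and the full-width setting follow the description above; the fallback import of `ydata_profiling` (the name the package was later published under) and the `None` guard are additions made here so the snippet still runs when the library is missing, and they are not part of the video.

```python
import numpy as np
import pandas as pd

# Import ProfileReport under the package name used in the video, falling back
# to the later package name; both fallbacks are assumptions for robustness.
try:
    from pandas_profiling import ProfileReport
except ImportError:
    try:
        from ydata_profiling import ProfileReport
    except ImportError:
        ProfileReport = None  # library not installed; only the data is built

# 100 rows and 5 columns of random numbers, with columns named a through e
df = pd.DataFrame(
    np.random.rand(100, 5),
    columns=["a", "b", "c", "d", "e"],
)

profile = None
if ProfileReport is not None:
    # full_width=True lets the HTML report span the full width of the page
    profile = ProfileReport(
        df,
        title="Pandas Profiling Report",
        html={"style": {"full_width": True}},
    )
```

In a Jupyter notebook, putting `profile` on the last line of a cell displays the report; `profile.to_file("report.html")` is one way to save it as a standalone page instead.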
Then we invoke the report by typing in profile, and here you have it: the pandas profiling report. This is done automatically, so it allows you to do exploratory data analysis with minimal effort. Just have a look and scroll around. It gives you the dataset statistics: there are five variables, a hundred rows, no missing data, and no duplicate data. Then it looks at each of the five variables, a, b, c, d, and e, and for each variable it gives you the descriptive statistics, such as the mean, minimum, and maximum, and also the histogram.

You can also look at the correlation plot between each of the five variables. Here we have five variables, a through e on one axis and a through e on the other. The correlation between a and a gives you a perfect correlation because it is a self-correlation. Then you can look at the correlation between a and b, a and c, a and d, a and e, and so on: b and a, b and b, b and c, b and d, b and e. So you can look at all possible correlations. This is the heat map of the Pearson's correlation matrix, and there are also Spearman's, Kendall's, and Phik. After the correlations, look at the missing values: you see that none of the variables contain missing values. And here you see the first ten rows and the last ten rows.

So it's very intuitive, and it allows you to get a quick exploratory data analysis of your data with minimal effort. You can see that this required only three lines of code. The first one imports the necessary libraries and the next generates the random data. In your case you might only import the necessary libraries and then create a data frame by reading in your CSV data. After that you generate the report using this block of code here, and afterward you look at your exploratory data analysis report by invoking the profile command.
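For your own data, the CSV-based workflow described above might look like the sketch below. The file name `my_data.csv`, its columns, and its contents are hypothetical stand-ins written to a temporary folder so that the example runs on its own; in practice you would point `pd.read_csv` at your real file. The `ydata_profiling` fallback import is likewise an assumption added here.

```python
import tempfile
from pathlib import Path

import pandas as pd

try:
    from pandas_profiling import ProfileReport  # name used in the video
except ImportError:
    try:
        from ydata_profiling import ProfileReport  # later package name
    except ImportError:
        ProfileReport = None  # library not installed

# Hypothetical stand-in for your CSV file, with one missing weight value
csv_path = Path(tempfile.mkdtemp()) / "my_data.csv"
csv_path.write_text(
    "height,weight,group\n"
    "170,65.0,a\n"
    "182,80.5,b\n"
    "158,,a\n"
    "175,72.3,b\n"
)

# Step 1: create the data frame by reading in the CSV data
df = pd.read_csv(csv_path)

# Step 2: generate the report from the data frame
profile = None
if ProfileReport is not None:
    profile = ProfileReport(df, title="Pandas Profiling Report")

# Step 3: in a notebook, evaluate `profile` in a cell to view the report
print(df.head())
```

Compared with the random-number example, the only change is that the data frame now comes from `pd.read_csv`; the report call stays the same.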
And as always, the best way to learn data science is to do data science, so please enjoy the journey. Thank you for watching. Please like, subscribe, and share, and I'll see you in the next one. But in the meantime, please check out these videos.
Original Description
In this video, I will be showing you how to use the pandas-profiling library in Python to easily and quickly perform Exploratory Data Analysis. In just a few lines of code you can get a glimpse of your data.
🌟 Buy me a coffee: https://www.buymeacoffee.com/dataprofessor
⭕ Playlist:
Check out our other videos in the following playlists.
✅ Data Science 101: https://bit.ly/dataprofessor-ds101
✅ Data Science YouTuber Podcast: https://bit.ly/datascience-youtuber-podcast
✅ Data Science Virtual Internship: https://bit.ly/dataprofessor-internship
✅ Bioinformatics: http://bit.ly/dataprofessor-bioinformatics
✅ Data Science Toolbox: https://bit.ly/dataprofessor-datasciencetoolbox
✅ Streamlit (Web App in Python): https://bit.ly/dataprofessor-streamlit
✅ Shiny (Web App in R): https://bit.ly/dataprofessor-shiny
✅ Google Colab Tips and Tricks: https://bit.ly/dataprofessor-google-colab
✅ Pandas Tips and Tricks: https://bit.ly/dataprofessor-pandas
✅ Python Data Science Project: https://bit.ly/dataprofessor-python-ds
✅ R Data Science Project: https://bit.ly/dataprofessor-r-ds
⭕ Subscribe:
If you're new here, it would mean the world to me if you would consider subscribing to this channel.
✅ Subscribe: https://www.youtube.com/dataprofessor?sub_confirmation=1
⭕ Recommended Tools:
Kite is a FREE AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite and I love it!
✅ Check out Kite: https://www.kite.com/get-kite/?utm_medium=referral&utm_source=youtube&utm_campaign=dataprofessor&utm_content=description-only
⭕ Recommended Books:
✅ Hands-On Machine Learning with Scikit-Learn : https://amzn.to/3hTKuTt
✅ Data Science from Scratch : https://amzn.to/3fO0JiZ
✅ Python Data Science Handbook : https://amzn.to/37Tvf8n
✅ R for Data Science : https://amzn.to/2YCPcgW
✅ Artificial Intelligence: The Insights You Need from Harvard Busi