Data Slicing in Python with Mito

Data Professor · Beginner ·🛠️ AI Tools & Apps ·4y ago

Key Takeaways

The video demonstrates how to use Mito, a Python package, to slice and dice data in a Jupyter notebook, using a Netflix dataset as an example. It covers installing Mito, importing data, creating pivot tables, and generating equivalent Python code.

Full Transcript

Hey, this is Jake from Mito. Thanks for checking out this video. I'm going to show you how you can really easily slice, dice, and transform your data using the Mito Python package.

The first thing we're going to do is run these two lines of code here: import mitosheet and mitosheet.sheet(). That's going to render our Mito front end. What the Mito front end is, is a spreadsheet front end for Python inside the Jupyter notebook, so every edit we make in this front end is going to generate the equivalent Python below.

The first thing let's do is just get our data in. We can get data in one of two ways. One is we can just search our local files here. I'm going to select this Netflix CSV file, which has some data about the different things you can watch on Netflix. When we do that, it populates the Mito sheet with our data from this Netflix CSV file, and below it generates the code that turns that CSV into a dataframe. The other way we can get data into the tool is to pass in a dataframe directly. You can call the Mito sheet at any point in your analysis: if you're working with dataframes above, all you have to do is pass in the name of the dataframe as an argument to this mitosheet.sheet call, and it'll populate the sheet.

Before we move on to ways you can actually analyze and understand your data using Mito, let me just show you how you can install it really quickly. All you have to do is run these three commands here from our documentation: we install the Mito installer, then run the install command from within the installer, and then just open JupyterLab, and you're good to go.

So, back in Mito, we have some data about the different titles, the different things you can watch on Netflix. There are TV shows and movies, obviously. Let's look at some summary statistics really quickly for the breakdown between TV shows and movies, just to understand our data better. I'll go to the summary stats tab. Here we can see we have a good amount more movies than TV shows, and we see the exact number there: 5,377 movies. The other data point I'm interested in here is the rating, so what these movies and TV shows are rated. Again, I'm going to look at the summary stats for those. Here we can see TV-MA, TV-14, and TV-PG are the most common ratings, and again we can see the exact values for each of them on the labels that pop up.

What I want to do now is understand the relationship a bit better, to really slice and dice my data in a way that gives me some insights. I'm going to do a pivot table, which is a great way to do that. I'll click pivot. The first thing I can do here is add the rating as my row, and as the column I'm now going to put the type, so it's either going to be TV show or movie. As the value I'll just select rating again and put the aggregation type as count, and what that does is populate this here. I'll close this pivot table now, and now we've made this really great, editable, updating pivot table inside Mito. We can see, for example, that for the rating TV-14, Netflix has 1,272 movies with that rating and 659 TV shows with that rating.

Again, as I said before, everything we do here is generating the equivalent code below. When I make this pivot table, it generates below the code that represents this exact same pivot table, so if we were to write this by hand, it would produce this same pivot table. What's really nice about the generated code is that I can carry it forward in my analysis. This is all real code I can use. I can run this cell here, and now this pivot table is called df2, as we can see here. If I were to print out df2 right here, we would see the same pivot table that we have in the cell above.

Another thing I might want to do to really understand my data, and to help decide how I want to transform and analyze it further, is put a graph on top of this to represent it graphically. What I can do here is just click our graph button. Let's just look at the movies across the different ratings, so I'm going to put rating on the x-axis, and on the y-axis I'm going to put movie. Oops, I just closed that; let me do it again: x-axis rating, y-axis movie. We can see here that TV-MA is the most common rating for the movies, TV-14 second most common, R third most common, and TV-PG fourth most common. One thing I can do, if I want to look at a subset of the data and get a better understanding of the comparison, is zoom in on a smaller set, and then all I have to do is double tap to get back out to the larger data set.

So let's go back to the base data set and look at another relationship, to understand our data even more and slice and dice it a little bit more. We see here we have the different countries; these are the countries that the titles are coming from. In the bottom right corner I can see I have 7,787 rows in here. Now I'll apply a filter to country. Let's say I just want to look at the ones that are from Brazil, so I'll do "contains Brazil". We can see we've filtered our data set down to just the values that contain Brazil, and below we see we've generated the equivalent code for that filter. We'll also see that there are 88 values left in the data set, so we've shrunk the data set a good amount by filtering down to the Brazil values. I can remove that filter as well, so let me remove it here, and we're back to the base data set.

Now let's say I want to see how many movies or TV shows I have from each country. All I have to do is pivot again. Actually, I'll go back to this pivot table here and edit it: I'm going to change this to country, get rid of this, and change this to country and count. Now I can see, for each country, the number of titles we have from there. If I want to make this a bit easier to view, I'll change this to descending order, and now I can see the United States has the most, India the second most, the United Kingdom the third most, et cetera. Below we have, again, the code for that pivot table, and the code for the sort we did into descending order.

So there's a lot of great analysis we can do in the tool. Mito is a really great way to slice and dice your data, look at different patterns and behaviors in the data, and understand how you want to proceed with it. Thanks for checking out the video. Hope you really enjoyed it.
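
The summary-stats and pivot steps described in the transcript can be approximated in plain pandas. This is a minimal sketch, not the code Mito itself emits: the six rows are made-up stand-ins for the Netflix CSV, the column names type, rating, and title are assumptions about that file, and the count is taken over title rather than rating to keep the pandas call unambiguous.

```python
import pandas as pd

# Hypothetical stand-in rows; the dataset in the video has 7,787 rows.
netflix = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "Movie"],
    "rating": ["TV-MA", "TV-14", "TV-14", "TV-MA", "TV-PG", "TV-14"],
})

# Summary-stats step: number of titles of each type.
type_counts = netflix["type"].value_counts()

# Pivot step: rating as rows, type as columns, count as the aggregation.
# fill_value=0 shows rating/type pairs with no titles as 0 instead of NaN.
df2 = netflix.pivot_table(
    index="rating",
    columns="type",
    values="title",
    aggfunc="count",
    fill_value=0,
)

print(type_counts)
print(df2)
```

Naming the result df2 mirrors the variable mentioned in the video, so later cells could keep working with it as an ordinary dataframe.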

Original Description

In this video, Jake (co-Founder of Mito) will give us a step-by-step guide on using Mito to slice and dice data in Python. Particularly, he’ll be using the Netflix data for this tutorial. The great thing about Mito is that the operations performed using the graphical user interface will also generate the underlying Python code that you can reuse.

Timeline
0:00 Introduction
0:30 Getting data in
1:18 Installing Mito
1:31 Analyzing the Netflix data
2:18 Creating a Pivot table
3:40 Creating plots and graphs
4:31 Understanding the data using filters

Links for this video:
Mito Installation Instructions: https://docs.trymito.io/getting-started/installing-mito
Mito YouTube Channel: https://youtube.com/channel/UCN9o_0m1fwCjigfIpnKr0oA


This video teaches how to use Mito to analyze and transform data in a Jupyter notebook. It covers the basics of Mito, including installing the package, importing data, and creating pivot tables.
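
As a rough illustration of the "pass in a dataframe" route, the sketch below builds a small dataframe and wraps the mitosheet.sheet call from the video in a helper. The helper name show_in_mito, the sample rows, and the CSV path in the comment are all hypothetical; the helper is only meaningful inside a Jupyter notebook with Mito installed, so it is defined here but not called.

```python
import pandas as pd

# Hypothetical stand-in for the Netflix data; in practice this might come
# from something like pd.read_csv("netflix.csv") (path is an assumption).
netflix_df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "type": ["Movie", "TV Show", "Movie"],
    "country": ["United States", "India", "Brazil"],
})


def show_in_mito(df):
    """Open df in a Mito sheet (intended for use in a Jupyter notebook)."""
    # Imported lazily so the rest of this cell runs without Mito installed.
    import mitosheet
    return mitosheet.sheet(df)


# In a notebook cell, one would then run: show_in_mito(netflix_df)
print(netflix_df.shape)
```

Keeping the import inside the helper is just a convenience for this sketch; in a notebook it is more common to import mitosheet at the top, as the video does.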

Key Takeaways
  1. Install Mito using the provided commands
  2. Import the Netflix dataset into Mito
  3. Create a pivot table to analyze the relationship between content ratings and title types (movie or TV show)
  4. Generate equivalent Python code for the pivot table
  5. Filter the data by country and generate equivalent Python code
💡 Mito provides a graphical user interface for data analysis and transformation, allowing users to create pivot tables and generate equivalent Python code.
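
The country filter and the per-country count from the takeaways above can be sketched in pandas as follows. This is an illustration under assumptions, not Mito's generated code: the rows are invented, and the column names title and country are guesses at the Netflix CSV's layout.

```python
import pandas as pd

# Hypothetical stand-in rows, including a multi-country and a missing value.
netflix = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "country": [
        "United States",
        "Brazil",
        "India",
        "United States",
        "Brazil, United States",
        None,
    ],
})

# Filter step: keep rows whose country text contains "Brazil".
# na=False treats missing countries as non-matches.
brazil = netflix[netflix["country"].str.contains("Brazil", na=False)]

# Count step: titles per country, sorted in descending order.
per_country = (
    netflix.groupby("country")["title"]
    .count()
    .sort_values(ascending=False)
)

print(brazil)
print(per_country)
```

A "contains" filter also matches co-productions such as "Brazil, United States", which is one reason a filtered row count can be larger than an exact-match count would be.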

Chapters (7)

0:00 Introduction
0:30 Getting data in
1:18 Installing Mito
1:31 Analyzing the Netflix data
2:18 Creating a Pivot table
3:40 Creating plots and graphs
4:31 Understanding the data using filters