Python Data Science Tutorial #16 - Pandas Merging Data Frames
Skills:
ML Pipelines (80%)
Key Takeaways
The video teaches how to merge data frames like SQL tables using Pandas in Python, a crucial skill for data science tasks.
Full Transcript
What is going on, guys, welcome to this Python tutorial series for data science. In today's video we're going to learn how to merge data frames together, how to join them together like SQL tables. You can join SQL tables together, so how do we do that in pandas, and what different types of joins or merges do we have? So let us get into the code.

As always, we start by importing pandas as pd. Now we're going to create two separate data frames that have the same column, the social security number, but one data frame holds the names and the other holds the ages of the people. So we say names equals a dictionary, and here we have the social security numbers, and we're going to choose, I don't know, 2, 5, 7 and 8. Then we have the name of the person, and the names are Anna, Bob, John and Mike.

Now I'm going to create the ages with the same social security number column but different values, and this is the key point here, because we're going to join them together into a new data frame that contains names and ages. For that we need the same index column, or at least one column that's the same, so that we can join them on this column. What I'm going to do here is not use the exact same social security numbers. Some are going to overlap and some are not. I'm going to start with 1, then 2, which overlaps, then maybe 3, and then 5. So these are the social security numbers here, and now I'm going to say age and define some ages: 28, 34, 45, 62.

These are now our two dictionaries, and we're just going to convert them into data frames real quick: df1 equals pd.DataFrame(names) and df2 equals pd.DataFrame(ages). Now we're going to create a new data frame, df, that contains both sets of values merged together in one data frame. To do that, of course, we just say df equals pd.merge, and that's the function we use to merge two data frames into one, to join them together. We have to specify the two data frames here, so we say df1 and df2, but besides that we also need to specify on which column we're going to merge them and how we're going to merge them. The column is obvious, because the social security number is the column that both data frames have, so we say on equals "ssn".

And now it gets tricky, or not tricky, but now we can choose between four different ways to merge these data frames: a left join, an inner join, an outer join and a right join. Basically, some of the social security numbers that we have in one data frame we don't have in the other, and the other way around. So which ones are we going to throw out? Are we going to take all of them, or only those values that are contained in both data frames?

If I say how equals "outer", for example, also known as the full join, what happens is that I basically say take all of them, take 1, 2, 3, 5, 7 and 8, and just display all the values. Of course I have to call set_index here, so df.set_index("ssn", inplace=True), so that ssn becomes the index, because that's not automatically the case. Now when we print the data frame, what you're going to see is that we have all the values. Of course they're not sorted, but we have all the individual values, and in the cases where a social security number occurs in both data frames, the values get linked together. So Anna has the social security number 2 and also the age 34, so it's one row. John, for example, who has the social security number 7, does not occur in the ages, so we just get NaN, for not a number. An outer join does exactly that: we take all the values, and where values are missing we fill up with NaNs. The same goes for the ages, of course: we have the age 28, which belongs to social security number 1, but we don't have a name for it. So this would be an outer join.

The opposite would be an inner join. An inner join only gives us the rows where we have all the information, so 2 and 5, because 2 and 5 are the social security numbers that occur in the first data frame and also in the second. So an inner join only gives us the rows that overlap.

Another way to do that would be a left or a right join. A left join takes everything from the first data frame, fills it up with the matching values from the second one, and sets the empty values to NaN. So what we're doing here is we take all the names, all four of these social security numbers, no matter whether they occur in the ages. If they do occur there, we fill in the values, and otherwise we just set them to NaN. The opposite is the right join, which takes all the rows of the second data frame and fills them up with values from the first. Here we would have all the ages but not all the names. As you can see, 1 and 3 have no names. So that's basically the right join, and that's how you merge data frames in pandas.

So that's it for today's video. I hope you learned something and I hope you enjoyed it. If so, hit the like button to support this channel and see future videos for free. Also feel free to ask questions and give feedback in the comment section down below, and of course subscribe to this channel if you want to see more in the future. Thank you very much for watching, see you in the next video, and bye. [Music]
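The walkthrough in the transcript can be reproduced roughly as follows. This is a sketch reconstructed from the spoken description, not the presenter's exact code: the column labels "ssn", "name" and "age" are assumed spellings, and the loop over the four join types and the sort_index() call are additions that keep the output compact and stable across pandas versions. The video calls set_index with inplace=True; the non-inplace form used here gives the same result.

```python
import pandas as pd

# Data as dictated in the video; the key spellings are assumptions.
names = {"ssn": [2, 5, 7, 8], "name": ["Anna", "Bob", "John", "Mike"]}
ages = {"ssn": [1, 2, 3, 5], "age": [28, 34, 45, 62]}

df1 = pd.DataFrame(names)
df2 = pd.DataFrame(ages)

# Run the same merge once per join type discussed in the video.
results = {}
for how in ["outer", "inner", "left", "right"]:
    df = pd.merge(df1, df2, on="ssn", how=how)
    # merge() leaves "ssn" as an ordinary column, so make it the index explicitly.
    # sort_index() is only here so the row order does not depend on the pandas version.
    results[how] = df.set_index("ssn").sort_index()
    print(f"--- how={how!r} ---")
    print(results[how])
```

Comparing the four printed frames side by side shows the difference: "outer" keeps all six social security numbers, "inner" keeps only 2 and 5, "left" keeps the four from the names frame, and "right" keeps the four from the ages frame, with NaN wherever one side has no matching row.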
Original Description
In today's episode we learn how to merge data frames like SQL tables.
Website: https://www.neuralnine.com/
Instagram: https://www.instagram.com/neuralnine
Twitter: https://twitter.com/neuralnine
GitHub: https://github.com/NeuralNine
Programming Books: https://www.neuralnine.com/books/
Outro Music From: https://www.bensound.com/
Subscribe and Like for more free content!