How do I apply a function to a pandas Series or DataFrame?
Key Takeaways
The video demonstrates how to apply functions to a pandas Series or DataFrame using the map, apply, and applymap methods, with examples that use pandas and NumPy on Kaggle's Titanic training data and a drinks-by-country dataset.
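As a rough illustration of the Series-level methods (map, and apply used on a Series), here is a minimal sketch. The two-row DataFrame is a hypothetical stand-in that only mimics the Titanic columns Name, Sex, and Fare so the example runs without downloading anything; the new column names (Sex_num, Name_length, Fare_ceil, Last_name) are assumptions chosen for readability.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic training data (illustrative values only).
train = pd.DataFrame({
    "Name": ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina"],
    "Sex": ["male", "female"],
    "Fare": [7.25, 7.925],
})

# Series.map with a dictionary: translate each existing value to a new value.
train["Sex_num"] = train["Sex"].map({"female": 0, "male": 1})

# Series.apply: call a function on every element.
# Pass the function itself (len), not a call (len()).
train["Name_length"] = train["Name"].apply(len)
train["Fare_ceil"] = train["Fare"].apply(np.ceil)


def get_element(my_list, position):
    """Return the element of my_list at the given position."""
    return my_list[position]


# str.split produces a Series of Python lists; keyword arguments given to
# apply are forwarded to the function.
split_names = train["Name"].str.split(",")
train["Last_name"] = split_names.apply(get_element, position=0)

# The same extraction written as a lambda.
last_name_lambda = split_names.apply(lambda x: x[0])

print(train[["Sex", "Sex_num", "Name_length", "Fare_ceil", "Last_name"]])
```

A dictionary passed to map leaves any value that is not a key as NaN, which is worth checking for when the column contains unexpected categories.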
Full Transcript
Hello and welcome back to my Q&A video series about the pandas library in Python. The question for today comes from a YouTube commenter who asks about two different concepts: number one, loc, iloc, and ix (when do you use each, differences, etc.), and second, apply, map, and applymap. I've already got a video about the first of these two, which I will link to in the description below, and that's a very important concept, especially how to use loc. But today we're going to focus on the second: apply, map, and applymap. Before we get into the content, I just want to let you know you should stick around to the end of this video for three important announcements.
Okay, so let's dig into the content. As always, we will import pandas as pd, and we need a dataset. We're going to use the training dataset from Kaggle's Titanic competition: http://bit.ly/kaggletrain. Let's go ahead and look at the head. Each record in this dataset represents a passenger on the Titanic.
We're going to first go through the map method. There's map, apply, and applymap, so we're going to start with map. map is a Series method, and here's what we're going to use it for. Let's say that you need to create a dummy variable for Sex. What that means is I want to translate Sex, which is male and female, into 1 and 0. So we're going to use map, and map allows you to map the existing values of a Series to a different set of values. What does that mean? I'll show you how we do that. We're going to create a new column called Sex_num, so we'll say train.Sex.map, and we're going to pass it a dictionary. We're going to say 'female': 0, 'male': 1, which means translate, or map, female to 0 and male to 1. We'll run that, and then let's compare Sex and Sex_num. I'm going to use the loc method and say I want to see rows 0 through 4 and the columns Sex and Sex_num. What we see is that male has been translated to 1 and female has been translated to 0. There is actually more you can do with the map method, but what I have just shown you, mapping values, is what it's best for, and so that's actually all I use it for.
All right, so we're going to go ahead and move on to apply. apply is actually both a Series method and a DataFrame method, so we're going to start with apply as a Series method. What does apply do? It applies a function to each element in a Series. Let's see this in action. Let's pretend I want to calculate the length of each of the strings in the Name column and create a new column called Name_length that contains that integer value, meaning how many characters are in each name. We're going to use apply for this, and again we're going to use it as a Series method. Our new column is going to be called Name_length, and we're going to say train.Name.apply(len). So we're applying the len function, Python's len function, which when applied to a string returns the length of that string. Let's compare Name and Name_length, so use loc again: the first four rows, and we want Name and Name_length. What we see is that we now have the Name_length column: this is 23 characters, this is 51 characters, this is 22 characters, et cetera. So the apply method, when used as a Series method, applies this function to every element in the Series and outputs the result. Now notice you do not say len with parentheses; you just pass it the name of the function.
Let me give you some more examples of apply. It's actually relatively common to use apply with a NumPy function. For instance, let's go ahead and import numpy as np. Here's one example: I'm going to look at the Fare column, and this is the fare in dollars or some currency, and let's say I want to round it up. So I want to round this to 8, round this to 72, round this to 8, et cetera. I'm going to use NumPy's ceiling function, ceil (c-e-i-l). Here's how we'll do it: I'm going to create a new column called Fare_ceil equals train.Fare.apply(np.ceil). Let's run that. So I applied this function to this Series and saved the results here. Again, let's compare the two columns: train.loc, 0 through 4, and I want to look at Fare and Fare_ceil. So it was indeed rounded up, and there you go.
All right, now let's use apply to solve a harder problem. Let's extract the last name of each person into its own column. What do we need to do? We need to get the part before the comma. How are we going to do that? You might think we can probably do this with a string method, if you've seen my video on string methods. So we're going to say train.Name.str (meaning string) .split, and we're going to split on commas, and I'll go ahead and put head on the end. So I'm splitting Name on commas. Now this looks very similar to what we saw up here (Braund, Mr. Owen Harris and Braund, Mr. Owen Harris), but they are actually very different, and this is very important. Up there it was a string; this has now split it into a list of strings. So the Series that is output is a Series of Python lists, and each list is made up of strings. In other words, this is a list of length two: here's the first element, comma, here's the second element, which actually includes a space right there. So that comma is not like the comma in the string; that is a comma separating list elements.
So we've successfully split it, but all we want is the first part. How do we just get the first part? What we really need to do is say, hey pandas, I want you to take this result and pull out the first list element from each Series element. I'm going to actually write a function to do this, and then we're going to apply it. I'm going to call it get_element, and you pass it a list called my_list and a position. What happens is simple: we're going to say return my_list[position]. So if I pass it a list and I say position 0, it will return to me element 0 from that list. What are we going to do now? We're going to take this and then say .apply(get_element, position=0). So I'm saying: pandas, take this Series, apply this function to every element, and pass it the keyword argument position=0. So pass position=0 to this function I've created. We'll put head on the end, and that actually does it: here is my Series of strings that is just the last names of these people.
Now you might be thinking we could actually do this with a lambda function, if you're familiar with lambda functions, and so that's what I'll do. I'll rewrite this. You don't actually need to make your own function called get_element for something so simple; you can rewrite this as a lambda function: lambda x: x[0]. That'll do the same thing. If you are familiar with lambda functions, this will be pretty clear, and lambda functions are actually used a lot with apply methods. So that is it with apply as a Series method.
Now let's move on to apply as a DataFrame method. For this we're going to use another dataset: I'm going to say drinks = pd.read_csv and bit.ly/drinksbycountry. What we're looking at is a dataset of alcohol consumption by country. So apply as a DataFrame method, what does it do? It applies a function along either axis of a DataFrame. I'm going to actually use a subset of this DataFrame: I'm going to say drinks.loc, I want all rows, and I want the beer_servings through wine_servings columns. So this is the DataFrame we are working with.
What am I going to do? If I do .apply here, I'm using it as a DataFrame method, not a Series method; I'm applying it to the entire DataFrame. So I'm going to say apply(max, axis=0). What am I saying? I'm saying I want the apply method to travel along axis 0, which is the down direction, and I want you to apply the max function, Python's max function. So apply it in this direction, and here's what results: beer_servings 376, spirit_servings 438, wine_servings 370. So it figured out the max value in each of those columns, because it was operating over axis 0. Now we're going to change it to axis=1, and that will be: I want the max value in each row, because axis=1 goes across. So now we see the max value across those three columns in each row is 0, 132, 25, 312, and those are the results we are seeing here.
I will show you one other trick I use quite a bit. Sometimes you don't care about the maximum value in a row, but you want to know which column is the maximum. This is where np.argmax is super useful. So I run that, and this tells me that, well, there is no single largest for Afghanistan, it could be any of them, but it'll default to the first, so it says beer_servings is the largest here, spirit_servings is the largest for the second, beer_servings is the largest for the third, wine_servings for the fourth, et cetera. So that's a neat little trick you can do with np.argmax.
All right, finally let's get into applymap. We covered map, which is a Series method; we covered apply as a Series method; and we covered apply as a DataFrame method. So now let's get to applymap, which is also a DataFrame method. I'm just going to show you one example here. So let me change that to applymap. What applymap does is apply a function to every element of a DataFrame. It doesn't go this direction or this direction; it applies it to every element. For instance, if I do applymap(float), it will change every element in this DataFrame to a floating point number: it's actually starting out as an integer, and it changes to a floating point. Now you can actually use this to overwrite the existing DataFrame columns. I'll just show you: if you say drinks.loc, all this stuff, equals drinks.loc, the rest, then we can do drinks.head() and we will see that the DataFrame has changed from integers to floating point numbers, and we have accomplished our objective in that case.
Okay, so there's no bonus for today, but as I said I have three important announcements for you. Number one: this is actually going to be the last video in the pandas Q&A series for now. Why is that? Well, each video takes me between 4 and 8 hours to create, and I have some other projects to focus on right now, so I'm not able to devote that time for the time being. However, I might be creating more videos in this series later on, so please keep asking questions. Please post in the comments; I will be happy to interact with you there and keep answering your questions that way.
Announcement number two: I have other videos you can watch. If you didn't know, this is video number 30 in this series, so there are 29 other videos. I have a video series called Introduction to machine learning with scikit-learn that's almost 7 hours long, I have a very popular series called Introduction to Git and GitHub, and I have some other videos. I will link to all of this in the description below this video.
And finally, the third announcement: if you didn't know, I teach online courses in data science and machine learning with Python. If you like this video series, you will love my courses. So what do I want you to do? I would encourage you to sign up for my email newsletter right now. There's a link in the description below for how to sign up. What do you get by signing up? Number one, you'll get priority access to my future courses; number two, you'll get access to exclusive content I don't share publicly, and a lot of other stuff. So please do sign up for my newsletter. That is it, so thank you so much for joining me for this entire series. It has been a real pleasure teaching you, so take care, and I hope to see you again soon.
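The DataFrame-level steps described in the transcript (apply along an axis, finding the largest column per row, and element-wise conversion) can be sketched roughly as follows. The three-row DataFrame is a hypothetical stand-in for the drinks-by-country data, with placeholder country names and illustrative values. Two substitutions are assumptions on my part rather than what the video shows: idxmax is used in place of np.argmax, because in recent pandas versions np.argmax tends to return integer positions rather than column labels, and the element-wise step tries DataFrame.map first, since pandas 2.1 and later deprecate applymap in its favor.

```python
import pandas as pd

# Hypothetical stand-in for the drinks-by-country data (illustrative values only).
drinks = pd.DataFrame({
    "country": ["A", "B", "C"],
    "beer_servings": [0, 89, 25],
    "spirit_servings": [0, 132, 0],
    "wine_servings": [0, 54, 14],
})

# All rows, and the beer_servings through wine_servings columns.
servings = drinks.loc[:, "beer_servings":"wine_servings"]

# DataFrame.apply with axis=0: the function receives each column in turn.
col_max = servings.apply(max, axis=0)

# axis=1: the function receives each row in turn.
row_max = servings.apply(max, axis=1)

# Label of the largest column in each row; ties resolve to the first column.
largest = servings.idxmax(axis=1)

# Element-wise application: DataFrame.map where available, else applymap.
if hasattr(servings, "map"):
    as_float = servings.map(float)
else:
    as_float = servings.applymap(float)

# Assigning by column names replaces the columns, so the float dtype is kept.
cols = servings.columns
drinks[cols] = as_float

print(col_max, row_max, largest, drinks.dtypes, sep="\n\n")
```

For a plain type conversion like this one, servings.astype(float) would usually be simpler and faster; the element-wise call is shown only because it mirrors the method under discussion.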
Original Description
Have you ever struggled to figure out the differences between apply, map, and applymap? In this video, I'll explain when you should use each of these methods and demonstrate a few common use cases. Watch the end of the video for three important announcements!
Subscribe to the Data School email newsletter: http://www.dataschool.io/subscribe/
Join "Data School Insiders" for exclusive rewards: https://www.patreon.com/dataschool
== DATA SCHOOL VIDEO TUTORIALS ==
Data analysis with pandas (30 videos): https://www.youtube.com/playlist?list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y
Machine learning with scikit-learn (10 videos): https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A
Version control with Git and GitHub (11 videos): https://www.youtube.com/playlist?list=PL5-da3qGB5IBLMp7LtN8Nc3Efd4hJq0kD
== PANDAS RESOURCES ==
GitHub repository for the series: https://github.com/justmarkham/pandas-videos
Series "map" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html
Series "apply" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html
DataFrame "apply" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html
DataFrame "applymap" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.applymap.html
== RELATED PANDAS VIDEOS ==
loc, iloc, and ix: https://www.youtube.com/watch?v=xvpNA7bC8cs&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=19
string methods: https://www.youtube.com/watch?v=bofaC0IckHo&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y&index=12
== JOIN THE DATA SCHOOL COMMUNITY ==
Blog: http://www.dataschool.io
Newsletter: http://www.dataschool.io/subscribe/
Twitter: https://twitter.com/justmarkham
Facebook: https://www.facebook.com/DataScienceSchool/
YouTube: https://www.youtube.com/user/dataschool?sub_confirmation=1