How to use Pandas in Python | Python Pandas Tutorial | Python Tutorial | Edureka Rewind
Key Takeaways
Using Pandas in Python for data analysis
Full Transcript
Hello everyone, this is Wasim from Edureka, and I welcome you all to this live session in which I'm going to talk about the Python pandas library. I'm sure most of you have heard about the pandas library in Python; it is an integral part of data analysis and serves as a building block of data analysis and data science. Let us take a look at the agenda for this session, but before that, let me get a quick confirmation that you can hear me. If I'm audible to you, please type yes in the chat box.

Now that I can see a lot of confirmations, let us go ahead and take a look at the agenda. First of all, I'm going to start with a basic introduction to pandas in Python, followed by data frames and series with examples, and then we will see how we explore data using pandas. Moving further, we will learn pandas operations, merging, grouping, reshaping, etc., and then I will discuss time series and categorical data. After this I will talk about plotting with pandas, and finally, to sum up this session, I will tell you about reading and writing files using pandas. I hope you are clear with the agenda.

So let us go ahead and take a look at what exactly pandas is. Pandas is a Python library used for data manipulation, analysis, and cleaning. It is well suited for different kinds of data: we can work on tabular data with heterogeneously typed columns, on ordered and unordered time series data, on arbitrary matrix data with row and column labels, on unlabeled data, and on any other form of observational or statistical data sets.

Now I'm going to tell you how you can install pandas on your systems. It's very easy: you just go to your command line or terminal and type pip install pandas. If you're working in an IDE such as PyCharm, you can simply type pip install pandas in the terminal over there, or you can just open the project interpreter and add the
library over there.

Since we're going to work in Jupyter Notebook, I'm going to show you how you can install pandas on Anaconda. You open the Anaconda prompt (we'll wait for it to set up) and you type conda install pandas. It's already there on my system, because I have already worked on various data analysis projects and pandas is a very integral part of them: to work on a data set, to read a data set, you require pandas. You simply cannot work without pandas if you are working on any data-related project, so this is how important pandas actually is.

I'm going to tell you a few applications of pandas as well. First of all, pandas is an integral part of whichever data project you're working on. You can work on economics, you can use pandas for stock prediction, you can use it for recommendation systems, and you can use it in neuroscience and statistics. Then there is advertising: with so much data coming from different platforms, you can analyze the data using pandas and clean it of irrelevant values and inconsistencies. And then you can use it for analytics as well; that's the most basic use of pandas that I can think of.

Now that we know what pandas is in Python programming and what it is used for, let's go ahead and take a look at the very integral parts of pandas, that is, data frames and series. We'll wait for this to install, guys; meanwhile I'll tell you what data frames and series are. A data frame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure. The data frame also contains labeled axes, which are rows and columns, and arithmetic operations align on both row and
column labels. It can be thought of as a dictionary-like container for Series objects.

Now what exactly is a series? A pandas series is a one-dimensional labeled array capable of holding data of any type: integers, strings, floats, Python objects, etc. The axis labels are collectively called the index, and a pandas series is nothing but a column in an Excel sheet. Let's take a look at a few examples and then you'll be able to understand better what data frames and series are.

We'll be working on an example now, so I'll take it to Jupyter Notebook. We have already created a notebook. If you are not familiar with Jupyter Notebook, we have a full tutorial on how to use it, the cheat sheet and everything, so you can refer to those on our Edureka YouTube channel.

Since I have already installed pandas, I'm just going to import pandas as pd and run this. I'm not going to face any problem running this command because pandas is already installed. As for the command in the prompt over here, it's going to install the package, the latest release of pandas, for me; I think it's not running, okay.

So we have our first statement over here and we have successfully imported pandas as pd. I'm using the alias pd; I hope you know what an alias is. I'm importing this library, and the alias is going to be this one, that is pd. I'll tell you why I'm using it: if I want to create a data frame, I'll use df as my variable name, and I use the alias, pd, and when I press Tab over here I can see what is available. So this is how I can use my alias to create a data frame; this is just to tell you how you use an alias.

Now I'm going to show you how you can create a series and a data frame. First, I'll have to import NumPy as well, because I'm going to use it to create a null value. All right, I'm going to make a series, so I'll
just write it as s, all right: pd.Series, and I'm going to pass a list of values, let's say 1, 2, 3, 4, 5, 6, then np.nan to create a null value, then 8, 9, and one more value, let's say 10. It's going to create a series. Now when I print s, we have a series which has an index that is created automatically, and all the values that I passed inside the list. So this is how we create a series in Python using pandas.

After this I'm going to tell you how you create a data frame, both from a dictionary object and from series as well. What we are going to do first is create a data frame by passing a NumPy array, with a datetime index and labeled columns. I'll take one variable, let's say date or dates; I'll just type it as d, and I'm going to take pd.date_range. After this I'm going to pass a start value: the year is 2020, and we're in the month of March, so I'll write it as March, and after this I'm going to take periods, which is equal to, let's say, 10. So this is my date range. Okay, I have an invalid syntax, right; it should work fine now. When I print d over here, I have all these values in date range format.

After this I'm going to take one data frame, which I'll call df for obvious reasons, to make it clearer. I'm going to take pd.DataFrame, and inside this I'm going to pass a few values. First of all I'm going to take a few random values, so I'm going to use np.
random.randn, and inside this I'm going to pass 10 and, let's say, 4. Now I'm going to set the index as d, and I have to pass one more thing, which is columns. I'll pass the columns as a list, and I'm going to take four columns: A, B, C, D. All right, do we have any errors? No. So now I'm going to print my data frame. I have a data frame which I have created by passing a NumPy array, and I have a datetime index with labeled columns, which are A, B, C, and D. This is my index, and I have all these random values from np.random. So this is how you create a data frame, just a simple example.

Next I'm going to show you how you can create a data frame by passing a dictionary of objects that can be converted into something series-like. Again, df is equal to pd.DataFrame, and I'm going to pass a dictionary over here. The first key is, let's say, A, and for its value I'm going to write a list of 1, 2, 3, and 4. My next key is going to be B, and I'm going to pass a timestamp; for the timestamp I'm going to use the same date I have used over here, 2020-03-01. After this I'm going to pass one more key, let's say C, and I'm going to use a Series object now. Inside this I'm going to pass 1, and the index is going to be a list with a range of four, because we have only four values over here and we don't want any null values. I have to give the data type of the series as well, so for that I use dtype is equal to, let's say, float32. After this I provide my next key, which is D. For D I'm going to use a NumPy array, and for this I'm going to pass a value, not three, let's say five multiplied by four, and let's take the dtype
is equal to int32, yes, all right. Now I take my final key, which is going to be E, and inside this we're going to use the Categorical object. We're going to talk about this later on in the session, so don't worry; I'm just showing you how you can create a data frame using all these objects that we have at our disposal. Instead of test and train, we can just call the categories true and false; it doesn't matter. We're taking a categorical object, so here it's going to be either true or false, or it could be zero or one; I'm keeping it decisive so that there are only two values. Then I take another key, F, and for this let's just say I give the value Edureka. All right, so our dictionary is done over here.

So we have created our data frame and there's no error. When I print this, we have our data frame with A, B, C, D, E, and F, so we have all these values using different data types, or we can call them objects as well. We can also check this on the data frame: we just write dtypes and it's going to give us all the data types that we have. We have the datetime stamps over here, integer, float, integer, category, and an object. Because I have used a string over here, it is giving us object, but in the new release, that is pandas 1.0.0, there is a dedicated string data type as well, so it doesn't have to be an object, so don't worry. We have already made a video on pandas 1.0.0 with all the features that have come with the new stable release, released last month; you can check that out as well.

Now that we have done this, let us take a look at the next topic, which is how to view data. Viewing data is basically how you actually look at the data using the pandas library. We'll jump right to Jupyter Notebook and I'll tell you what kind of functions we have at our disposal that we can use to view
our data.

We'll do one thing first. We have data frames already over here, so I'll just change this one to df1 so that we have different data types, I'm sorry, different data frames. I'll run this, this, and this as well. When I check df.dtypes it should be different, because we have already made a data frame using that. All right, it's not, so I'll restart and run all the cells so that we have two different data frames.

The first, very basic thing that you can do with your data frame is to write df.head. What this function is going to do is give you the first five rows of your data frame, and similarly for the last rows you can use the tail method. So this is how you get the first and last rows of your data frame: it's going to display the five rows that you have at the beginning and at the end of your data set.

After this we have df.index, which will give you all the values from your index, and similarly we have df.columns, which is going to give you all the columns of your data frame. So this is how you view your data.

Then we have DataFrame.to_numpy, which is going to give you a NumPy representation of the data. I'll just write df.to_numpy, wait a second, yes. So I have created a NumPy array using this. For df, our data frame of all floating-point values, DataFrame.to_numpy is fast and does not require copying the data, so it is a very good fit for what we have. Okay, I'll just remove this, I don't want this, and then we have DataFrame.
describe, which is going to give you something like this: the count, the mean, the standard deviation, the minimum, 25%, 50%, 75%, and the maximum. All these values you can have using describe, which is going to give you an idea or a perspective of what your data actually looks like and what kind of calculations are already there for you.

Then we have sorting by an axis. For that you just have to write df.sort_index, and inside this you have to give the value of the axis; I'll give 1, and then ascending. Do you want it to be ascending? No, I'll write it as False. So it has given me the data frame sorted along that axis. Similarly I can sort by values: I'll write sort_values and I'm going to give by, and I want it sorted by C. It has sorted the rows depending on the values in C. So this is how you sort your data frame.

Now that we know how we can actually look at our data, I'm going to tell you how you select particular values inside your data. First, how you select a single column from your data frame. It's very simple: you write df and the column name, A or, let's say, C. It's going to give all the values from C over here; it has also given you the frequency, the name, and the data type. So this is how you get a single column from your data frame.

Now let me show you how you can slice the rows as well. For that we're going to use slicing; if you have worked with list slicing, we are going to follow the same principle here. I'll just write df, and I want my first value to my third value, so it has given me only three rows, starting from the zeroth row, and it has
given me up to the third row, but it does not actually include the row at index three, because counting starts from zero, so I'm getting three rows. If I write six over here, I'm going to stop at the sixth value, but it's not going to be the row numbered six, because the first row over here is the zeroth row. I hope you understand this. So I have shown you how you can slice your data to get a particular number of rows.

Now let me tell you how you can select data using labels. For that you have to use df.loc, and inside this you're going to pass the values by label. Let's pass d[0] and see what the output is. All right, we have got all the values using the label from d, which we passed in our previous section where we declared the data frame, and I'm sure this is visible to you.

The next thing is selecting data on multiple axes by label. We write df.loc, and after this we write a colon and then a list, let's say A and C. It's going to give me the values according to what I pass over here, so instead of A I can write B, or I can write D. So this is how you can select on multiple axes using labels. And where I have written the colon, I can just write, let's say, 0 to 3, and let's see what happens. Oh, we have an error, guys; you cannot do this.

We'll now move on to the next topic, which is label slicing, where both ends are actually included. How do we do that? Instead of this we can copy this label, paste it over here, and copy this B over here, and remove this. This is just to show you how you can work with it. I'm very certain that the data you work on is not going to be like this; it's going to be much more complicated, so this is just to give you a perspective, as a beginner, on how you can work with pandas.

Now I'm going to tell you how you can reduce the dimensions of the returned object as well. For that you just
type one thing: remove all this and keep just one label, and this is going to give you the values for that one label over there, and this is how you get the values inside a data frame.

Moving on, let's say we want to get a scalar value. For that we write, let's say, d[0] and a column; let's see if it works. We're getting the same kind of value, only from the zeroth row. For getting fast access to a scalar, instead of loc we can just write at. Okay, we have an error; I'll just remove this and let's see if it works. Yes, so I'm getting the exact value at the zeroth row in column C.

Now I'm going to tell you how you select a value using its position inside your data frame. For that we use df.iloc. Let's say 3: okay, so I'm getting all the values from the row at position three. Similarly we can slice the data, like 3 to 5, and we can add more to this, like 0 to 2 for the columns. So this is how you select values from your data frame by position.

Now we have Boolean indexing as well inside our data frames, so I'll tell you quickly what it is. You just write df, and inside it I'm going to check whether column A of df is greater than zero. This is interesting: it has given me all the rows where A is greater than zero. If I write, let's say, two, I have no rows, because none of the values are greater than two. So this is how Boolean indexing works, and it is actually important when you are applying functions to your data frame.

Moving on, let's take a look at another method, which is the isin method. It's basically used to check whether particular values are present inside your data frame or not.

There's one more thing: we can set new values inside our data frame. We can set a new column, which automatically aligns the data by the index; for that we can make a series. We can set the values by label, we can set the values by position, and we can
set values by assigning a NumPy array as well. The result of each of these settings is aligned with the existing data frame.

Now let's go ahead and take a look at the next topic, which is handling missing data inside your data frame. Let's jump right to it: we'll go to Jupyter Notebook and work with our missing values now. Pandas primarily uses the value np.nan to represent missing data, and by default it is not included in computations.

We're going to see a missing value right now. First of all you have to do reindexing, which allows you to change, add, or delete the index on a specified axis, and which returns a copy of the data as well. All right, I'll take df2 is equal to df.reindex. The index is equal to, let's say, d from 0 to 4, and the columns after reindexing are going to be the list of df.columns plus one more column, which is going to be, let's say, E. Now I'll use loc to set a few values: instead of dates I have taken d, so from d[0] to d[1], at E, equal to 1. Now let's check df2: as you can see, we have two null values over here. So this is how I'm going to show you how to handle missing values inside your data frame.

We have done the reindexing, so first of all I'm going to check for null values. We have True over here where the values are null, and we can get the count as well: isnull, and we count these null values. Right now we're going to drop the NA values, so we use dropna. As you can see from our data frame, everything that had null values is dropped; actually the whole column has been dropped. Or we can do one thing, fill in the missing data. Okay, check df2; so we can do one
thing: we can fill the missing values with fillna, and we're going to provide some value, let's say value is equal to 2. Right, so wherever there was a missing value, it has been filled with the value we gave. So this is how you check for and handle missing values inside your data frame. After this you can also get a Boolean mask of where the values are NaN, which is null. For that you write pd.isna and pass df2. This is going to get you a Boolean mask, which I've already shown you using isnull; it's the same thing, just a different way to run it.

Now let us move ahead to the next topic, which is pandas operations. Pandas operations are nothing but a few operations that you can apply on a data frame or any other pandas object. We have descriptive statistics, we can apply functions, histogramming is there, and string methods are also there. I'll tell you what histogramming is when we get to it. Let's take it up in Jupyter Notebook again.

First of all I'm going to show you a basic operation, a descriptive statistic: df.mean is going to give us all the mean values. Similarly we can pass an axis, like df.mean with one value provided over there, so this is how you get it on the other axis; or we can write it as two, right, all right.

Now, operating with objects that have different dimensionality needs alignment; in addition, pandas automatically broadcasts along the specified dimension. For that let's make a series and I'll tell you how it works. pd.Series, give it a few values, let's say 1, 2, 3, np.nan, 5, I'm sorry, 4, 5, and then give the index, which is going to be d, our dates, and let's shift all this two places. Right, we have made an error: the length of the passed values is 6 and the index implies 10, so we have to put in more values. So we write 6, 7, 8, 9, yes. Now
when I print s over here, we have all these values. Now we can do one more thing: we write df.sub and pass s, which is our series, and we give the axis as index. So we have operated with objects that have different dimensionality and needed alignment, and in addition pandas automatically broadcast along the specified dimension for us.

Now I'm going to talk about applying functions to the data. We already have a data frame; let's see what we have. So we have this data frame, and I'm going to apply a few functions to it. First of all I'm going to use the apply method over here, and let's see what all I have under np: absolute, all, allclose, amax, amin, angle, any, append, all these functions that I can apply. Let's say we take cumsum and see what this does. All right, so this is how it works.

Now let's apply a few more functions. I'll write lambda x, so we are talking about lambda functions here. I'm sure most of you must be aware of lambda functions in Python; if you don't have any prior knowledge of them, there is a full tutorial on how lambda functions work in Python. So this is how we applied a lambda function to get the difference between x.max and x.min: for all the columns we have the maximum minus the minimum. So this is how we apply functions to our data.

Now I'm going to talk about histogramming. A histogram is a representation of the distribution of data, so the function we have here calls matplotlib.
pyplot.hist on each series in the data frame, resulting in one histogram per column. What we'll do is make a series and get its value counts for histogramming. How do we do that? We can just write s.value_counts; let's see if it works. Right, we have a count of one for each value. So this is how you do histogramming with your data.

Now I'm going to talk about string methods. Series is equipped with a set of string processing methods in the str attribute that make it easy to operate on each element of the array. Let's move ahead with the example. I'll make a series, pd.Series, and inside this I'm going to pass a few string values. I'm going to start with Edureka, then python, next let's write Jupyter, and give a few null values as well to make it slightly different from perfect. All right, give it a few more values, let's say football, and looking at the current scenario let's write w. So we have a series over here. Now we'll use the string methods: I was going to make everything lower case, but these are already lower, so I'll make everything upper case. So we have changed everything using the string methods inside our pandas series. These are the operations that you can perform in pandas.

Let's move ahead to the next topic, which is merging. In merging we are basically going to merge two data frames together, and we have two functions, which are concat and join. Concat concatenates pandas objects along a particular axis, with optional set logic along the other axes. It can also add a layer of hierarchical indexing on the concatenation axis, which may be useful if the labels are the same or overlapping on the passed axis number. Let's take a look at an example for this. For our data frame I'm going to use pd.DataFrame and give it a few values, let's say np.random.randn with 10 and 4. All right, so we have all these values.
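As a rough sketch of the value counts, string methods, and concat ideas discussed so far (the sample values and the way the frame is split into three pieces are illustrative assumptions, not the exact notebook cells from the session):

```python
import numpy as np
import pandas as pd

# Value counts: how often each distinct value occurs in a series.
s = pd.Series([1, 2, 2, 3, 3, 3])
counts = s.value_counts()

# String methods live under the .str accessor; missing values stay missing.
words = pd.Series(["Edureka", "python", np.nan, "football"])
upper = words.str.upper()

# concat: split a frame into row-wise pieces (an assumed split),
# then glue the pieces back together along the row axis.
df = pd.DataFrame(np.random.randn(10, 4))
pieces = [df[:3], df[3:7], df[7:]]
rebuilt = pd.concat(pieces)

print(counts)
print(upper)
print(rebuilt.shape)
```

Passing the list of pieces to pd.concat stacks them in order, so the result has the same rows and columns as the original frame.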
Now what I'll do is break it into pieces. How do we do that? Let's say I write df2 is equal to a list: df from the first to the third row, the next one is df from the third to, let's say, the seventh row, and then we have df from the seventh row to the end of the data. So this is how I'm going to break it into pieces. Now when I look at df2, we have three data sets. Right now I'm going to use the concatenation function, pd.concat, and I'm going to concatenate df2. So this is how I have concatenated the missing pieces, not the missing pieces, the several pieces, together using the pandas concat function.

Now let's talk about the join function. For this I'm basically going to tell you how you can do a join on a left and a right frame. For the left one I'll write left is equal to pd.DataFrame, and for the first key, let's say A, give it a few values, let's say 1, 2, and then for B we have 3, 4. All right, we make one for the right as well, so I'm just going to copy all this and paste it over here. Okay, I'm going to write this one as D, this is going to be, let's say, C, and change a few values. So we have left and right; I'm going to type left, and then we get the output for right. Now I'm going to join these two, and for that I'm going to use the merge function. Inside this we pass left, then right, and on is equal to; I'm going to join on, let's say, a. Let's see how it works. Okay, it doesn't: we have a key error, which is a, so I'll change it to capital A. So we actually join using the merge function over here. Another example that I can think of: let's say you have left and right, we can change the values differently, and we can just group them together.

What I'm going to talk about now is grouping: how you group different data inside your data frame using pandas. First of all, you have to split the data into groups based on some criteria that you have, then after that you apply some
functions on them, and then later on you can combine the results into a data structure.

First of all you have to get a data frame, so let's see, we have a data frame over here. We can group it by, let's say, A: I'll write groupby, A, and then sum. Right, we have an error: a key error, that is A, because we don't have A over there. So I'll write it as, let's say, two. All right, we don't have "two" as well; we have a key error because the label is not a string value, it's an integer value. So we have our values right over here: we have grouped the data using the column that is number 2.

Now we can group the data by multiple columns and form a hierarchical index. For that I'll just copy this, or I'll do it over here only: 2 and, let's say, 3. So this is how you combine multiple columns to form a hierarchical index. But here again we don't actually have categorical values for these columns. If we had, let's say, true and false, it would have created a hierarchical index with different values for the true rows and for the false rows. So that's how you do grouping, and that's where you actually need grouping in pandas.

Now let's move ahead and take a look at the next topic. Okay, we have already talked about operations; then I talked about merging, where we covered concat and join; and then I talked about grouping, that is, splitting the data, applying functions, and combining the results together. Now I'm going to talk about stack and pivot tables.

So what exactly is a stack in pandas? I'm sure most of you must have heard some other definitions of stack, so I'm going to tell you about it from the perspective of the pandas library. The stack function is used to stack the prescribed levels from columns to index, and it returns a reshaped data frame or a
series having a multi-level index with one or more new innermost levels compared to the current data frame. You're going to understand this with an example.

I'm going to take, let's say, my tuples is equal to a list, and inside this we provide two lists, a list inside a list. Let's take a few values: 1, 2, 3, 4, 5, and then again, let's say, 6, 7, 8, 9, and 10. Or we'll add a few more values, let's say 11, 12, and 13 over here, and a few more, let's say 17, 18, and 19. So we have our tuples. I'm going to create one more variable, index, where we'll have a multi-index: MultiIndex.from_tuples, and inside this I'm going to pass my tuples and pass the names as well. The names are going to be, let's say, first and second. So we have our index.

Now I'm going to create the data frame. For this we write pd.DataFrame, and inside this I'm going to pass a few values, np.random.randn, so we have eight rows and two columns. The index is going to be index, and columns is equal to A and B. All right, we have an attribute error; I made a mistake. So there's no error now, and I'm going to make one more data frame, and inside this I'm going to pass this value. Now I check my data frame: I have reshaped my data frame with this first and second and all these values.

Now we're going to talk about the stack method, which compresses a level in the data frame's columns. We're going to do df2.stack. Right, this is an attribute error. So this is how we stack, or compress, a level in the data frame's columns. With a stacked data frame or series, having a multi-index as the index, the inverse operation of stack is unstack, which by default unstacks the last level. So we'll do df2.unstack, and this is how you unstack. Now we'll talk about the pivot tables. Oh wait, let's put it inside a variable, let's say a, all
Right, so this is how you unstack; we're getting different values there. Now we're going to talk about the pivot tables that we have in pandas. The levels in the pivot table will be stored in MultiIndex objects on the index and columns of the resulting DataFrame. We'll take a look at one example, which is going to make it pretty clear. We'll take df again, pd.DataFrame, and inside this we're going to take a list inside a dictionary. So let's go with a few values: say the first key is A, and we write a, b, b, c, d; we'll remove a few values over here, and multiply by three. Now we take another key, which is going to be B, and for this we're going to take a list again, multiplied by four. Take another key, let's say C, and inside this we're going to pass six values, so let's write p, p, p, q, q and q, multiplied by two, because we want the number to be 12. Now D is going to be np.random.randn, and the number we want is 12, and we take one more key, E, with the same values as well: np.random.
randn(12). No errors, I guess. So we have an invalid syntax; we forgot to add a comma over here, and here, and here. Right, I think it should work fine now without any errors. So we have printed the DataFrame over here, and this is how it looks. Now we can produce pivot tables from this data very easily; the very reason for creating this DataFrame was to get the pivot tables. What I'll do is write pd.pivot_table, and I'm going to pass df over here, the values is going to be, let's say, D, index is equal to A and B, and columns, let's say, is equal to C. So this is how you create a pivot table.

Now that we're done with pivot tables, let me talk about the next topic, which is time series and categoricals. We have done reshaping, merging and grouping as well. pandas has simple, powerful and efficient functionality for performing resampling operations during a frequency conversion, for example converting secondly data into 5-minutely data. This is extremely common in, but not limited to, financial applications. So we're going to take a look at a few examples.

As for categorical data: data that you collect can be either categorical or numerical, and numbers often don't make sense unless you assign meaning to them. Categorical data is data that is collected in groups or categories, and also data that is collected in an either/or or yes/no situation; for example, we have 0 or 1, or True or False, and those are the categories. So let's take a look at a few examples to understand time series and categoricals. First of all I'm going to make a time series, and then we'll convert the data into 5-minutely data, which is very common. So I'll just take the range first of
all. So we'll take the dates. All right, I made a mistake; just cut this. I'll run this again, we'll get one more column, and now I'll make one variable using pd.date_range. I'm going to provide the start as, let's say, 2020-01, or 03-01; we want periods is equal to 100 and, let's say, freq is equal to "S". All right, let's print dates. OK, so we have dates over here. Just remove this for now. Now we take one more variable, let's say ts is equal to pd.Series, a time series, and inside this I'm going to use np.random.randint, which is going to be 0 to 500, and the length is that of dates, and then we have index is equal to dates. OK, we have an AttributeError on the random call, so it's randint. All right, now we have an unsupported dtype, so I'm going to have to change this over here. We'll write the date as, let's say, 3/1, or 3/3/2020. Let's see if it works now. It doesn't; we're still getting the unsupported dtype. OK, so we just mention a few more things over here; we write the time as 00:00:00. I think I figured out the problem: I've changed this to this format, and I'm going to add one more parenthesis here and remove this one, and now I'm not getting any errors. So I'll run it again. Now I have to make a few changes to that ts, the time series I've made, so I'll write it as ts.
resample, and I want to make it 5 minutes, and I have to sum this. So no errors there. Now I'm going to do the time zone representation. For that I'll just copy this and paste it over here with a little change: here I have to make a few changes, that is 00, the rest is going to be fine, and we change this to five. All right, now we make another series, timestamp, and we just use pd.Series, and inside this I'm going to use np.random.randn, and the number of values is going to be the length of dates, with dates as the index. Now when I print timestamp, I'm getting output somewhat like this. So this is how you get the time zone representation: you're getting the date, the time and everything, and the data type is float64.

Then you can also get UTC. For that I'll just write ts_utc, and we write timestamp.tz_localize and pass "UTC". Right, it's UTC now; I'm going to print this, so we get the UTC as well. So this is how you create the time zone representation. After this I want to show you how we can convert to another time zone. For that we just write ts_utc.tz_convert, and we write "US/Eastern". So we have converted it to the US Eastern time zone.

Converting between time span representations, we can do that as well. Let's just say that's your exercise, so that you'll be able to understand this better. For that you just have to take the date range, periods is going to be five, and for the frequency, instead of "S" you write "M"; the rest is all going to be the same. OK, I'll just do it here as well: copy this, paste it over here, and instead of "S" it's going to be "M". We have to remove this. All right, now I make the timestamp series, which is going to be the same. Now print the time
stamp. This is how we convert. After this we take one more variable, let's say ps, and I'm going to use to_period. Now when I print ps over here, this is the output I get: the frequency is M. I have changed the frequency, and now I can just convert it back with to_timestamp. Let's see where it is. This is how you convert it into a timestamp. Converting between period and timestamp enables some convenient arithmetic functions to be used; for that we have period_range and all those things that you can add.

Now moving on to the next topic, which is categoricals. For that let me just take one more DataFrame, so I'll write df is equal to pd.DataFrame. I'm going to take a few values inside this dictionary. The first key is going to be, let's say, id, and I'm going to pass a list with a few values, six precisely. The next key is, let's say, raw_grade, or just grade, we'll write. So we're going to write a, b, c; let's say again this guy is getting b and this one's getting a, so we have five, and one more: let's say one guy has failed, e. So we have our DataFrame; let's print this. What I'm going to do now is take the grade: df["grade"] is equal to df["grade"], and this is going to be the category type I'm going to make. Now you print df["grade"], so we have grades like a, b, c, b, a, e.

Now I'm going to rename the categories to more meaningful names. I'll make the change over here only, so I'll write cat.categories, and I'm going to write the categories as, let's say, good, instead of pass or fair I'm just going to write very good, and then there is excellent. All right, so we have very good, very bad and excellent. All right, we have an error: new categories need to have the same number of items as the old categories. So how many categories do we have over there? I'll just copy this; so we have four categories.
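The length rule behind that error can be sketched as follows. This is a minimal illustration, not the exact code from the session: the id values and the new category names are assumptions, and rename_categories is used here in place of assigning to cat.categories directly, which newer pandas releases no longer allow.

```python
import pandas as pd

# Six students; the grades follow the order read out in the session.
# The id values are an assumption for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "grade": ["a", "b", "c", "b", "a", "e"],
})

# Convert the text column to the category dtype.
df["grade"] = df["grade"].astype("category")
n_categories = len(df["grade"].cat.categories)  # a, b, c, e

# Three new names for four existing categories is rejected.
error_message = None
try:
    df["grade"].cat.rename_categories(["good", "very good", "excellent"])
except ValueError as exc:
    error_message = str(exc)

# One new name per existing category works (names are hypothetical).
renamed = df["grade"].cat.rename_categories(
    ["excellent", "very good", "good", "very bad"]
)

# set_categories replaces the category list; values whose category is
# no longer listed ("excellent" here) become NaN.
reordered = renamed.cat.set_categories(
    ["very bad", "bad", "medium", "good", "very good"]
)

print(error_message)
print(reordered)
```

The NaN entries in the last output are the same effect the session shows next when categories are set without covering every existing value.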
So we're going to have to provide four categories; I'm just going to add good as well. There shouldn't be any errors now. Now what I'll do is set the categories. Here I'm going to have to give the values: very good, bad, very bad, medium, good, any sort of categories I have to give over here. So I'll just write very bad, good and, let's say, medium. All right, after this I have set the categories as well. Now I'll just write df["grade"]. OK, so we have good, very bad, then very good, very bad and good, and then we have NaN, because we have not given any matching category for that value, and we have five categories of object type: very good, bad, very bad, good, medium. So that's how you use categoricals in pandas.

Now I'm going to talk about plotting using pandas, which is going to be very simple. For that I'm going to have to import one more library, that is matplotlib.pyplot, and I want to use it as plt. All right, same thing goes with this as well; this is going to be pyplot. So there should not be any errors now. OK, I will close all. Now I'm going to make one Series, and inside the Series I'm going to provide a few random values, np.random.randn, up to, let's say, 1000, or let's say 500, yes, 500, and the index is equal to pd.date_range. I'm going to take the start date as 1/3/2020, and let's take periods is equal to 1000. Wait, we have to take it as 500, because we have 500 values over there. I hope no errors; yes. Now I'm going to take the time series and get the cumsum. Now ts.
plot, so we have a plot over here using pandas. This is how we have created one Series using random numbers from the NumPy library, and using pyplot we have plotted a graph for the 500 random values we took, over the date range as well. So this is how you get a plot using pandas.

Now, last but not least, we have another topic, which is reading and writing files. Here I'm going to show you how you can read from a file and how you can create a file. We have our DataFrame, or we have our ts; this is our ts. I can just convert it to a CSV file with to_csv and give it a file name, let's say ts.csv, and it's going to save the ts.csv file somewhere in my directory. Similarly, I can read from a CSV file. For that I can just write pd.read_csv, and I'm going to have to give the file location. I'll copy a file location from one of my data sets; so this is one data set that I have, I'm going to check the properties, or wait, I'm going to copy this and paste it over here. OK, we have a Unicode error, so I'll just write r before the string over here, and I'm able to read from the CSV file. Look at this. Instead of CSV I can write Excel, and it's going to create an Excel file or read from one. So that's how we read and write files, like a CSV file, which is basically a comma-separated values file.

Now that we have come to the end of the session, I hope everything discussed is clear to you. You can freely drop your doubts in the comment section or reach out to the Edureka community to post your queries. Thank you, and happy learning.
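The CSV round trip from the last part of the session can be sketched as follows. This is an illustration under assumptions: the start date, the column name "value" and the temporary directory are hypothetical choices made so the example runs anywhere, and the Windows path in the comment is made up.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# A cumulative-sum series of 500 random values on a daily date index,
# similar to the one plotted in the session (start date is an assumption).
rng = np.random.default_rng(0)
ts = pd.Series(
    rng.standard_normal(500),
    index=pd.date_range("2020-01-03", periods=500),
    name="value",
).cumsum()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ts.csv")

    # to_csv writes the index as the first column of the file.
    ts.to_csv(path)

    # index_col=0 restores that column as the index, and parse_dates=True
    # turns it back into datetimes.
    loaded = pd.read_csv(path, index_col=0, parse_dates=True)["value"]

# On Windows, a pasted path such as "C:\Users\..." fails because "\U"
# starts a Unicode escape; a raw string avoids that, for example:
#   pd.read_csv(r"C:\Users\me\data.csv")   # hypothetical path

print(loaded.head())
```

The Excel counterparts mentioned in the session are to_excel and pd.read_excel; they need an extra engine package such as openpyxl to be installed.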
Original Description
This Edureka video on 'How to use Pandas in Python' will help you get started with Python Pandas Library for various applications including Data analysis.