Data Analytics With Python 2026 | Python Data Analytics Tutorial for Beginners | Simplilearn
Key Takeaways
This video teaches data analytics with Python and serves as a Python data analytics tutorial for beginners.
Full Transcript
Most people look at data and see numbers. Data analysts look at the same data and see answers. That difference isn't talent, it's skill, and that skill can be learned. Welcome to this course on Python for data analysts. Every organization today is surrounded by data, whether it's sales figures, customer behavior, performance metrics, or reports. But raw data on its own doesn't drive decisions. What creates impact is the ability to collect the right data, clean it, analyze it, and clearly explain what it means. And that's exactly what a data analyst does. In this course, we'll learn Python for data analytics, one of the most widely used tools for modern data analysis. More importantly, you'll learn how Python fits into the complete analytical workflow, exactly the way it is used in real-world projects. We move step by step from Python fundamentals to data cleaning, analysis, visualization, and automation, helping you think and work like a professional data analyst. In this course, you'll learn how to set up a Python analytics environment, use Python fundamentals for data analysis, clean and prepare real-world data sets, analyze data using NumPy and pandas, perform exploratory data analysis (EDA), create clean, business-ready visualizations, work on real-world analytical projects, and finally automate reports and workflows using Python. Before we dive in, here's a quick question for you. Which Python library is mainly used for data cleaning and analysis? Is it NumPy, pandas, Matplotlib, or Seaborn? Do let me know your answers in the comment section below. Also, if these are the type of videos you're interested in, do like, share, and subscribe to Simplilearn. Now, let's get into what a data analyst actually does. Many people hear the term data analyst, but the real work happens in a few clear steps, and this slide captures them. The first step is collecting data. 
In real projects, data does not come from just one place. A data analyst works with CSV files, Excel sheets, company data sets, and sometimes data coming from websites and applications. Using Python, we bring all this data together into one place so we can start working with it. Once the data is collected, you'll notice something very important: raw data is never clean and often has inconsistent entries. So the next step is cleaning and preparing the data. This is where the data analyst spends a lot of time making sure the data is accurate and usable. After the data is cleaned, we move on to analysis and visualization. Here the goal is to understand what the data is trying to tell us. We look for patterns, trends, and relationships. Using Python, we create simple charts and graphs that make the data easier to understand, even for non-technical people. Finally comes the most important part, which is presenting insights and automating reports. A data analyst's job is not just to analyze data but to explain it in a way that helps the business make decisions. Instead of repeating the same work every time, Python is also used to automate reports so insights can be generated quickly and consistently. So overall, a data analyst collects the data, cleans it, analyzes it, and then turns it into meaningful insights that drive better decisions. Python plays a key role in every step of this process. Now that we have understood what a data analyst does, the next obvious question is: why Python? The first reason is ease of learning and readability. Python looks very close to normal English, which makes it easier to read and write, even for beginners. Compared to older programming languages, Python usually requires much less code to perform the same task, which saves time and reduces errors. The second big reason is Python's rich analytics ecosystem. 
Python has powerful libraries like NumPy for fast mathematical operations, pandas for working with tables and data sets, and Matplotlib or Seaborn for creating visualizations. With these libraries, complex data tasks become simple and efficient. Another important reason is flexibility and scalability. Python works well whether you are analyzing a small Excel file or handling a larger data set. It also helps automate repetitive tasks like daily data processing or weekly reports, which is extremely useful in real-world projects. And finally, Python has strong industry adoption and career value. It is widely used across finance, e-commerce, healthcare, technology, and many other industries. Knowing Python opens doors to roles like data analyst and business analyst. So overall, Python is easy to learn, powerful, flexible, and widely accepted in industry, which makes it a strong choice for data analytics. Now let's compare Python with Excel, SQL, and R, because many people ask: do we really need Python when these tools already exist? Let's start with Python. Python is powerful because it supports automation and scripting. It can handle complex logic and work with large-scale data. With Python, we can clean data, analyze it, visualize it, and even automate the entire workflow in one place. Next is Excel. Excel is great for quick analysis and simple visualization. It's very user-friendly and works well for small data sets. However, when the data becomes very large or the logic becomes complex, Excel starts to struggle. Now, let's talk about SQL. SQL is mainly used for data extraction. It is excellent for querying databases, filtering the data, and joining tables. But SQL alone is not meant for machine learning, advanced analytics, or automation. Finally, there is R, which is very common in statistics and data visualization, especially in research and academic work. 
But when it comes to building production-ready applications or automating end-to-end workflows, R is less commonly used compared to Python. So in real-world projects, Python often works along with Excel, SQL, and sometimes R. Python acts like a central tool that connects everything, making it the most versatile option for a data analyst. Now let's take a look at where Python analytics is actually used in the real world. Python is not limited to one industry. It is used across many domains, wherever data is involved. First is business and marketing. Companies use Python to analyze customer behavior, track campaign performance, understand sales trends, and even predict future growth. Python helps businesses make smarter decisions based on data instead of guesses. Next is e-commerce and product-based companies. Here, Python is used to analyze user activity, recommend products, optimize pricing, and improve customer experience. Many online platforms rely on Python analytics to understand how users interact with their products. The third area is finance. Python is widely used for financial analysis, risk assessment, fraud detection, and forecasting. Banks and financial institutions rely on Python to process large volumes of data quickly and accurately. And finally, Python analytics plays a major role in healthcare, sports, and education. In healthcare, it helps analyze patient data and improve treatment outcomes. In sports, it's used for performance analysis. In education, Python helps track student performance and learning trends. So no matter what the industry, business, finance, healthcare, or education, Python analytics helps turn data into insights that drive better decisions. Now let's talk about the Python ecosystem for data analytics. One of the biggest reasons Python is so popular is not just the language itself but the powerful libraries that come with it. We'll start with NumPy, which is mainly used for numerical computation. 
It helps Python handle large amounts of numerical data efficiently and forms the foundation for many other data analytics libraries. Next is pandas, which is one of the most important libraries for data analytics. It helps us work with data in the form of tables, similar to Excel but more powerful. It is used for cleaning, filtering, grouping, and transforming data. Once the data is ready, we need to visualize it, and that's where Matplotlib comes in. Matplotlib is used to create basic charts like line graphs, bar charts, and pie charts, helping us understand data visually. On top of Matplotlib we have Seaborn, which is used for advanced and statistical visualizations. It makes charts more attractive and easier to interpret, especially when working with patterns and comparisons. For machine learning, Python provides scikit-learn. The library is used for tasks like prediction, classification, and clustering, and it integrates smoothly with pandas and NumPy. And finally we have Jupyter Notebook or Google Colab. These tools are used to write code, run experiments, and visualize results step by step. They are very popular for learning, analysis, and sharing results with others. So overall, the Python ecosystem provides everything a data analyst needs, from data processing and visualization to machine learning and experimentation, all in one place. For the setup, I have three free options that you can set up on your laptop to start running Python for analytics. The first one is Google Colab. You just need a Google account, then search for Google Colab and open the first link. Here you can see all my other files. You can just click on create new notebook and you'll have a sheet to work on. All the tools that I'm using run code cell by cell, and each cell is executed by clicking on it. Now for Google Colab, the first thing you have to change is the name. 
So give it any name, and as you can see we have other options such as rename, move, move to trash, save a copy in Drive, open notebook, and new notebook. You can integrate Git here as well, and also download and print. We have edit tools also, so you can select cells, copy and paste them, and delete already executed cells, which will have the output in them. Here there is a plus code option, which adds a code cell. But if you want plain text, you can click on text and enter a heading. You can also see that you can either edit the cell or delete it, and the other options for each cell are over here. Colab runs in a browser, gives you a free Jupyter-like notebook, and needs only a Google account. It is great for beginners because there is no installation and everything runs on Google's servers. So let's try an example. Let me import a library: import pandas, and run it. As you can see, it shows the number of seconds the cell took, and if there is any output, that is displayed too. If you look here, you'll see RAM and disk. If you want to perform heavier operations, you can change the runtime type to the T4 GPU. This gives you more compute for heavier work. Now here are some disadvantages of using Colab. Execution can be a little slower compared to Jupyter Notebook or VS Code. The second one is that every time you open the notebook, you need to upload the files you were working on again. Next, the second way is VS Code plus Jupyter. For that you need to install VS Code first. Just search for VS Code, open the first link, and you will see an option to download it for Windows. Click on that and run the installer. Just click next, next, and you will be able to see VS Code here. So let me just open VS Code. 
So as you can see, these are my previous projects. You can just close them, and this is the screen you'll get after installing VS Code. Now how do you run a Python Jupyter notebook here? As you can see, there is an Extensions option. Click on that, type in Jupyter by Microsoft, and there will be an option for you to install it. If you install it, you get a Jupyter-like notebook. Before this, you have to make sure Python is present on your system. For that, you can search for Python and download the latest version, whichever it is. Go to python.org and then Downloads. You can see the option to get the standalone installer for Python. Click on it, the download will start, and you can install it the same way as VS Code. Here you'll find an option for Python 3.14. If you click on that, the installation process will start. But here is a major thing that people miss: use admin privileges when installing py.exe, and the most important one, which is add python.exe to PATH. This makes Python available across your system, so you don't need to be in a particular folder to run Python code. If you click on install now, the installation will start. Since I already have Python, I don't need to do that. Now how do you confirm that Python is installed? Press Windows + R, type cmd, and the command prompt will open. Then type python --version. It should display the version that you have downloaded. Here I have 3.13.7. So this is for your system, and you can open Jupyter from here as well. If you type jupyter notebook, it will start a local server and open a Jupyter notebook. As you can see, every little thing is logged in the command prompt. And if you're running any Jupyter notebook this way, any changes that you make will be reflected in the command prompt. 
So let me show you: this is the Jupyter notebook you'll be getting. If you create a new folder, you can create a Python notebook, and as you can see, the command prompt got updated. Again it's a cell-based layout, and you can type import pandas and run it. While it is running, the cell shows a star, and after running it shows one, meaning the run was successful. Here you can see that every time I run code or make changes in this file, the command prompt updates. Now, if you don't want to start it from the command prompt, go to Extensions in VS Code again, type Python, click on Python from Microsoft, and install this as well, so you will be able to start a Jupyter notebook there. After installation, you start a Jupyter notebook by pressing Ctrl+Shift+P. Here you will see the different commands you can run, and if I choose create new Jupyter notebook, it creates one. Again you can see a cell-based layout here, the same as Jupyter Notebook and Colab. One option is code, markdown is basically the text, and then there are run all and outline. Here you can rename the notebook and just type import pandas. And how do you run this? Just click on run. When you click on run, the kernel should start up. It is taking some time to connect to the kernel, but it will run, and you can write multiple cells here and use the text cells for headings. Now here you can see the import has run. The third way of running Python is to use Anaconda Navigator. How do you install that? Just search for Anaconda Navigator, open the first link, and download it. If you click on this, it'll start downloading, and then you can click on the downloaded file and start installing. Mind you, it will take many minutes to install Anaconda. But after installation, you will have Anaconda on your system. When you open it, it will take some time to start. So as you can see, it is initializing. 
So we'll just give it time to load. The reason I use Anaconda Navigator is that it has Python, Jupyter, and many data packages in one go. Anaconda is a distribution that installs Python and pulls in data libraries such as NumPy, pandas, etc., and Anaconda Navigator is a GUI launcher. From Navigator, you can open Jupyter Notebook within a few clicks. Once Anaconda Navigator starts, it'll look like this. You can install PyCharm, Toolbox, etc. from here. You can also click on connect and sign in, so all your information will be there. Let me just check, because I had downloaded it. Yes, here you can launch Jupyter Notebook directly. You also have JupyterLab. If you click on launch, it will start a Jupyter notebook. Now as you can see, a Jupyter notebook has opened. Again it's the same thing that we opened via the command prompt and also in VS Code. So again you can create a Python notebook and run it. It's the same interface. Let's execute the same thing, and it runs with no errors. So out of these three, you can start using whichever is convenient for you for data analytics. In this entire course, I'll be using Google Colab because you will already have it online, so you can just open it instead of waiting for the whole download process. Of course, you can download and use the other options. The same code will work on all the platforms. Now let's run a small program in the notebook so we can cross-check it. First let's take a simple number and a string. Let's say age is equal to 25 and course name is equal to Python. Since this is a string value, it should be in quotes: Python for data analytics. Then print age and course name. Here you can see it has printed that age is 25 and course name is Python for data analytics. These are names that store values. You can reuse them. 
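The variable example described above can be sketched as follows. This is a minimal sketch: the names age and course_name follow the narration, but the exact print formatting used on screen is an assumption.

```python
# Two variables: a whole number and a piece of text (string).
age = 25
course_name = "Python for Data Analytics"

# Variables can be reused anywhere later in the notebook.
print("Age:", age)
print("Course name:", course_name)
```

Running the cell prints both values, which confirms the notebook environment is working.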
Age and course name are variables, which hold a value and can be reused throughout the entire program. So let's label this section Variables, using a text cell since I don't want it as code. The next type we are going to initialize is a list. So let's see what lists are. We can name the section Lists. For the coding part, let's say monthly sales is equal to 120, 135, 150, and 160. This is a collection of data in one place, which is a list. There are two common types for collecting data like this, lists and tuples: lists are mutable and tuples are immutable. Then let's print it: print monthly sales, and run it. You can see it is 120, 135, 150, 160. Also, if you observe, a list is an ordered collection. We'll later convert this into a pandas Series or even a DataFrame, so keep in mind that the order matters. Next, let's take a look at a simple loop, which is useful for data analytics. Basically, a loop iterates over values. First I'm using a for loop. We have for sale in monthly sales, which means each value in turn from the list of monthly sales. So the loop says for sale in monthly sales, print monthly sale, comma, sale. That means it prints the sales value for every month, one by one. So it should give me monthly sale 120, next monthly sale 135, etc. Let's run this loop and see the output. As you can see, the first monthly sale is 120, the second monthly sale is 135, etc. Now let's start with a quick analytical function, a very small one. The first one is total is equal to sum of monthly sales. Here we are calculating the total sales value, which should be 120 + 135 + 150 + 160, all together with one function, which is sum. Let's also get the average. So let's say average sale is equal to total sales divided by the length of the list. 
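The list, the for loop, and the total and average calculation described above can be sketched as follows. This is a minimal sketch: the variable names total_sales and average_sale are assumptions based on the narration.

```python
# An ordered, mutable collection of monthly sales figures.
monthly_sales = [120, 135, 150, 160]
print(monthly_sales)

# Visit each value in order and print it.
for sale in monthly_sales:
    print("Monthly sale:", sale)

# sum() adds every element; len() counts the elements.
total_sales = sum(monthly_sales)
average_sale = total_sales / len(monthly_sales)

print("Total sales:", total_sales)
print("Average sale:", average_sale)
```

With these four values, the total is 565 and the average is 565 / 4 = 141.25.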
So let's say divided by len, which calculates the length of the monthly sales list. This gives us the average. Let's print the total sales and also the average. Let's run this code. Let's see what the error is. There's a spelling mistake in total. Again there's a spelling error, so let's rectify that. And here you can see the total monthly sales is 565, which is the sum of the list that we have given, 120, 135, 150, and 160. Coming to the average, the value will be 565 divided by the count, one, two, three, four, which gives us 141.25. This is a very small analytical step, using just average and sum. Now before we get started, let's refresh our Python knowledge. The first topic is variables, and what are the different kinds of values they hold? Int, float, string, list, and dictionary. These are the different data types. The second thing that you have to concentrate on is conditional looping. And the third thing is functions. These are the three things that we'll be concentrating on. First, variables. We know that we have int, float, string, list, and dict. We'll start with int. If I give month is equal to 1, this is an integer, which is a whole number. If I want a decimal number, we need a float. You could also write month is equal to int of 1, which is the same thing, but Python is smart enough to infer the type without it. So even if I don't use int, this will be created as an int. Similarly for float, if I want to give a float value, say revenue is equal to 123.5, revenue will be treated as a float. You can also write revenue is equal to float of 123.5. This is also fine, but again you don't need to state the type unless it is necessary, since Python figures it out itself. After float, we have string. Now how do you create a string? 
So let's say product name is equal to, say my product is a pen and the pen is, say, gel. You initialize the product name with double quotes. Keep in mind that any characters you put inside quotes will be a string. Next we have list and dictionary, which we'll come to in a moment. Now let's print everything: month, revenue, and let's say I want product name also, and let's print them together. Here you can see 1, 123.5, and gel as the printed values. Now what about lists and dictionaries? How do you create a list? We have already done that with monthly sales. Let's keep it small. It is equal to square brackets with whatever numbers you want inside: 120, 135, 150, and 90. This is your list. It's a little bit different for a dictionary. For that, let's say monthly summary is equal to curly braces. A dictionary stores key-value pairs. The first key will be month, followed by its value. Which month is the first one? It's January. After that first pair, give a comma and a space and enter the second key, which is sales. What is the amount of sales done in January? It is 120. The next key I'll be using is target, with the value 100. If you run this and print monthly summary, you can see first month January, then sales 120, then target as 100, as initialized. You can do the same thing for every month. Now to summarize the essential data types in Python that you should know: int is a whole number, float is a decimal number, str, which stands for string, is for text, list is for an ordered collection, and dictionary, which is dict, is for key-value data. Now moving on to the second part, which is conditional loops. How do you write conditional loops? We'll take a tiny list of monthly sales and print only the months that beat the target. 
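The five data types summarized above can be sketched together as follows. This is a minimal sketch: the value 123.5 for revenue and the string "gel" for product_name are assumptions reconstructed from the narration, and the key names in monthly_summary follow what is read out.

```python
# int: a whole number
month = 1

# float: a decimal number
revenue = 123.5

# str: text, written inside quotes
product_name = "gel"

# list: an ordered collection, written inside square brackets
monthly_sales = [120, 135, 150, 90]

# dict: key-value data, written inside curly braces
monthly_summary = {"month": "January", "sales": 120, "target": 100}

print(month, revenue, product_name)
print(monthly_sales)
print(monthly_summary)

# Python infers each type from the value; type() shows what it chose.
for value in (month, revenue, product_name, monthly_sales, monthly_summary):
    print(type(value).__name__)
```

Individual dictionary values are looked up by key, for example monthly_summary["sales"].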
Out of the entire set of monthly sales, we need only the values that cross the target. How do you do that? If the condition holds, the code inside the if block runs. That's how conditional looping, or conditional loops, work. As usual we have monthly sales is equal to 120, 135, 150, 90. We're going to reuse that. Let's set the target. I need only sales which are above 100. For that, the code should be for sale in monthly sales. This is the loop. Next we have the if condition: if sale is greater than 100, or greater than target if you set one. Let's use target. Then we print above target sale, followed by the sale amount. We have to initialize target first, so press enter and set target is equal to 100, then run the code. As you can see, we have above target sale 120. Let's cross-check. The first one, 120, is above the target of 100. 135 is above and 150 is above. Here you can see 90 is not printed, because it is not above the target. Now if I change my target to 150, there should be no values. Let's run it. And there are no values above the target. So this is conditional looping. For a simple understanding: for goes through each sale in the list, if checks a condition, and we only print when the condition is true. This is a simple loop with a single condition. We can move up to multiple conditions inside a loop as needed. Next we'll discuss something called enumerate. What is enumerate? Let's start coding and I'll explain. Along with the monthly sales, let's add the month names: month names is equal to, since these are strings, Jan, Feb, March, April, and let's end with May. 
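The conditional loop described above can be sketched as follows. This is a minimal sketch: the helper name sales_above is hypothetical and is introduced only so the same loop can be rerun with a different target, and it collects the matching values in a list in addition to printing them.

```python
def sales_above(sales, target):
    """Print and return every sale strictly greater than the target.

    Hypothetical helper, not named in the video.
    """
    above = []
    for sale in sales:          # for: visit each sale in the list
        if sale > target:       # if: only act when the condition is true
            print("Above target sale:", sale)
            above.append(sale)
    return above


monthly_sales = [120, 135, 150, 90]

hits_100 = sales_above(monthly_sales, target=100)   # 90 is filtered out
hits_150 = sales_above(monthly_sales, target=150)   # nothing exceeds 150
print(hits_100, hits_150)
```

Because the comparison is strictly greater than, a sale of exactly 150 does not count when the target is 150.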
These are the month names that I have initialized. Now let me finish this with the for loop: for i, sale in enumerate of monthly sales. Then I have an if condition, which is if the sale is greater than the target, which I'll be changing back to 100. Then we have print month names with i as the index, followed by hit the target, along with the value. What is the difference between the previous loop and this one? I have added enumerate. It gives you the index, so you can track the index and the month names together. That is the case where we use enumerate. So let's change the target to 100, run this, and compare the result with enumerate. Each month that hit the target has been printed, along with the words hit the target and the value. Enumerate lets us print month names of i. It starts with zero, which is Jan. It checks Jan's value, which is 120 and hits the target, so we have 120. Then for Feb, the i value becomes one, and then two, and then three, and each time it checks that month's value. So basically, enumerate is the index plus the value. This is pretty important because when you're doing data analytics, you will want to capture the position, which is the index, so you can compare multiple arrays or multiple columns, etc. So where will you use enumerate? Instead of manually managing an index counter, enumerate gives you both the index and the value in a clean way. The third thing that I talked about is functions for analytics. Let's create a simple function to see how functions work. We start by defining a simple function. For a function, we write def, a space, and the function name. Here I'll be calculating the discount, and I'll pass a parameter, which is revenue. So for any function, it's def, the name, and then the parameters, the values that we pass in for that particular block to execute. 
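The enumerate loop described above can be sketched as follows. This is a minimal sketch: the list name hits is an assumption added so the result can be inspected, and the month abbreviations are shortened for consistency.

```python
month_names = ["Jan", "Feb", "Mar", "Apr", "May"]
monthly_sales = [120, 135, 150, 90]
target = 100

hits = []
# enumerate yields (index, value) pairs: (0, 120), (1, 135), ...
for i, sale in enumerate(monthly_sales):
    if sale > target:
        # The index i picks the matching month name from the other list.
        print(month_names[i], "hit the target with", sale)
        hits.append((month_names[i], sale))
```

Here the index runs from 0 to 3 because there are four sales values, so the fifth month name is never used.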
So after this, what should the calculate discount function do when it is called? Here are the things that I will note down first so that we can code accordingly. What should this function do? It returns the discount amount based on the revenue. We can do this using conditions, which you already have an idea about. Now coming to the second part, what are the discounts? 10% if revenue is greater than or equal to 10,000. So the first discount is going to be 10% if the revenue reaches 10,000. Similarly, I will provide a 5% discount on 5,000 and up, and I won't provide any discount otherwise, so zero otherwise. These are the things that we have to implement in the function called calculate discount. The first line of code will be the if statement. We are trying to see if the revenue is 5,000, 10,000, in between, etc. So the first check will be if revenue is greater than or equal to 10,000. Then what will be the discount rate? Discount rate will be equal to 10%. What is 10%? It's 0.10. Now we'll add an else-if condition. So we have elif revenue is greater than or equal to 5,000, discount rate will be equal to 0.05. And lastly we have else. What goes in else? Else it is zero, right? The discount rate should be zero. And then we have return. We don't want the discount rate itself to be returned. We need return revenue multiplied by the discount rate, since we need an amount. Okay. Now this is a block that we can reuse. We have not passed in any revenue yet. So let's just run this code. There is an indentation error. Let's pull this back. And return is outside the function, so let's fix that with the right indentation and run it. As you can see, there are no errors, and this is a function that takes different revenues and gives the output. Now how do you get this output? How do you test this output? 
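The function described above can be sketched as follows. This is a minimal sketch: the names calculate_discount, revenue, and discount_rate follow the narration, while the sample call with 12,000 is an illustrative assumption.

```python
def calculate_discount(revenue):
    """Return the discount amount for a given revenue.

    10% at 10,000 and above, 5% at 5,000 and above, otherwise 0.
    """
    if revenue >= 10000:
        discount_rate = 0.10
    elif revenue >= 5000:
        discount_rate = 0.05
    else:
        discount_rate = 0
    # Return the amount, not the rate. Note the return is indented
    # inside the function body, which avoids the error seen in the video.
    return revenue * discount_rate


sample_revenue = 12000
discount = calculate_discount(sample_revenue)
print("Revenue:", sample_revenue, "Discount:", discount)
```

The order of the checks matters: testing the higher threshold first means a revenue of 12,000 gets the 10% rate rather than stopping at the 5% branch.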
Again take a cell and type in sample revenue. Let the revenue be 12,000, and let's say discount. We need the discount, right? Discount is equal to a call to the function. What is the function name? We have calculate discount here. This is the function name, and we have to pass in the value. What is the value here? It's the sample revenue. You can pass the value directly, but if you want it to be organized, you can assign it to a variable first. At last, we print it. So we have print revenue, where we show the sample revenue, and along with that let's add the discount. So let's see: first I've given sample revenue as 12,000, then discount is equal to a call to the function, which calculates the discount, and the sample revenue is passed in, so this 12,000 is passed here. Then we have print revenue, so it should print revenue is equal to 12,000, and the discount is 10% of 12,000. Let's run this. Here you can see revenue is 12,000 and the discount is 10% of 12,000, which is 1,200. Again, if you want a different value, you just need to change this value or call the function whenever necessary, without changing the function itself. This is one of the most important parts of Python, using and calling functions, because a lot of code and a lot of coding time will be saved. Now let's see the difference between traditional looping and how you can loop better. Let's call this section tips for Pythonic code. I'll show you two different versions of looping, so you will understand how looping was traditionally written and how it is written now for better readability. First, let's say revenue is equal to 4,000, 6,000, and 12,000. 
Now if I want to calculate the discounts with a loop, first initialize discount loop, which is equal to an empty list, and then start with for r in revenue. Inside the loop we have discount loop dot append, calculate discount of r, and close the brackets. Then we print discount loop. This should be outside the for loop, so just bring it out. So this is a normal loop calculating the discounts. Now I'll code the same thing the Pythonic way. How do you do that? Let's say discounts is equal to, inside square brackets, calculate discount of r for r in revenue, close the brackets, and then print discounts. Both versions perform the same function. The only difference is that a list comprehension lets you build a new list in one readable line instead of multiple lines with append. So here you can code efficiently with fewer lines. Now let's move on to the next concept, which is enumerate. We have already discussed it, but here we'll use enumerate for reporting. Okay, this is not a text cell. So let's get a text cell and delete this one. We need enumerate for reporting. Now how do you report with enumerate for the previous problem? First start with the for loop: for i, r, so here I'm using two variables, in enumerate. What do you pass to it? Revenue. Now check that revenue is defined. You can see revenue is there. But have I executed this cell? No. Both the results are the same. The discounts are 0, 300, and 1,200. I forgot to show you this. Both versions of the loop give you the same result. But considering the number of lines of code and also the clarity, I feel the Pythonic way is much better. You can start practicing like that, and then you'll get used to Pythonic coding. Coming back to enumerate for reporting revenue. 
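The two versions of the loop described above can be sketched side by side as follows. This is a minimal sketch: calculate_discount is repeated here only so the block runs on its own, and the names discounts_loop and discounts are assumptions based on the narration.

```python
def calculate_discount(revenue):
    """10% at 10,000 and above, 5% at 5,000 and above, otherwise 0."""
    if revenue >= 10000:
        discount_rate = 0.10
    elif revenue >= 5000:
        discount_rate = 0.05
    else:
        discount_rate = 0
    return revenue * discount_rate


revenue = [4000, 6000, 12000]

# Traditional version: start with an empty list and append in a loop.
discounts_loop = []
for r in revenue:
    discounts_loop.append(calculate_discount(r))
print(discounts_loop)   # printed once, outside the for loop

# Pythonic version: a list comprehension builds the same list in one line.
discounts = [calculate_discount(r) for r in revenue]
print(discounts)
```

Both lists hold the same three discounts, one per customer, in the same order as the revenue list.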
The for line is done; now I need to print. Here we use an f-string, which fills values into a line of text, something like print(f"Customer {i + 1}: revenue = {r}, discount = {discounts[i]}"). Close the brackets and run it. There's obviously a syntax error; I have not initialized this part. We do not want it, so let's get rid of it and run again. So here, as I told you before, every customer is named, every revenue is named, and every discount is named. This is what you get with enumerate: it picks out the exact customer, revenue, and discount. Customer 1 has a discount of zero, because the revenue is less than 5,000. Customer 2 has revenue of 6,000, which is between 5,000 and 10,000, so the discount is 5% of the amount, which is 300. Customer 3 has revenue of 12,000, and the discount is 1,200. That is the important thing about enumerate: it gives you both the position and the specific value from the list.
Now let's move on to the next chapter, which is NumPy. Why use NumPy arrays? First, NumPy arrays are stored in contiguous memory, enabling fast operations compared to Python lists. Second, they support element-wise operations without looping, which is called vectorization. And third, arithmetic and math functions are included in NumPy. Let's see a quick, tiny example. The first thing is to import NumPy; the standard form is import numpy as np. Run it, and there are no errors. Now let's initialize daily revenue in thousands. The first line is revenue is equal to np.array, with parentheses and square brackets so that the array is initialized properly, and the values 10, 12, 9, 11, 13, and 8; use any numbers for practice. This is a 1D array in NumPy. Let's run it.
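To recap the enumerate report from this part before going further into NumPy, here is a minimal sketch. The exact wording of the printed line and the calculate_discount tiers are assumptions; the transcript only confirms the customer numbers, revenues, and discounts that appear in the output.

```python
def calculate_discount(revenue):
    # Assumed tiers, reconstructed from the quoted results.
    if revenue < 5000:
        return 0.0
    elif revenue <= 10000:
        return revenue * 0.05
    return revenue * 0.10


revenue = [4000, 6000, 12000]
discounts = [calculate_discount(r) for r in revenue]

report_lines = []
for i, r in enumerate(revenue):
    # i is the zero-based position, so add 1 to get a readable customer number.
    report_lines.append(
        f"Customer {i + 1}: revenue = {r}, discount = {discounts[i]}"
    )

for line in report_lines:
    print(line)
```

enumerate hands back the position and the value together, which is what lets each report line look up the matching discount by index.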
So this is done. Let's perform some basic operations. The first one is revenue with GST. What should revenue with GST be? It is revenue multiplied by 1.18, because the GST rate right now is 18%; multiplying by 1.18 adds 18% tax. How does this work? Think of 1 as the whole amount. If you want to increase it or add a tax, you multiply by more than 1, here 1.18. If you want to decrease it, say deduct 10%, you multiply by 0.90. I hope that is clear. Next, let's apply a discount: discounted_revenue is equal to revenue times 0.9, which applies a 10% discount. Then there is the built-in sum: total_week_revenue is equal to revenue.sum(), where sum is a built-in method of the array. These are the NumPy basics.
Some key points to notice: there is no loop. In revenue times 1.18, the multiplication is applied to every element at once. That is called vectorization. When you multiply the array, every number in it is multiplied by 1.18, or by 0.9, and the sum adds up every element, all without a loop. Functions like sum, mean, min, and max are built into NumPy arrays and run very efficiently. If I had to do the same thing with loops and without NumPy, it would take me three loops. I hope the basics of NumPy are clear.
Now let's move on to indexing and slicing, and we'll also do some reshaping. Indexing and slicing are the same kind of thing: you pick out whichever parts of the array you want. Suppose an array has 10 elements, indexed 0 to 9. You can cut it into five and five, or take elements at fixed gaps.
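The vectorized operations described above can be sketched like this. The seven values are an assumption: they follow the weekly array used in the later indexing and statistics sections, whose quoted sum is 78.

```python
import numpy as np

# Daily revenue in thousands (assumed seven-day sample, Monday to Sunday).
revenue = np.array([10, 12, 9, 11, 13, 8, 15])

# Each operation applies to every element at once; no explicit loop is needed.
revenue_with_gst = revenue * 1.18      # adds 18% tax
discounted_revenue = revenue * 0.90    # applies a 10% discount
total_week_revenue = revenue.sum()     # built-in array method

print(revenue_with_gst)
print(discounted_revenue)
print(total_week_revenue)
```

With plain Python lists, each of the three results would need its own loop or comprehension; here each is a single expression on the array.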
So let's see an example of indexing. First, initialize an array with np: revenue is equal to np.array, and we are using the same values again, so we can copy and paste them. As you can observe, there are seven values, so we can treat this as a week: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, and Sunday. If I say revenue[0], this is indexing, and it gives the first day, which is 10. Indexing starts at 0, so the positions are 0, 1, 2, 3, 4, 5, and 6, and revenue[0] gives you the first day. Then there is revenue[-1]. Minus one may look a little odd, but it gives the last day, which is 15 here. Why use -1? If you know the length of the array, you can use the length minus one, so revenue[6] gives the same value. But if you don't know the length, you can use -1.
Next, if you want everything from one index up to another, there is a format for that. I want the workdays, so workdays is equal to revenue[0:5], from index 0 up to, but not including, index 5. It's that simple: you get the first five values, at indices 0 to 4. And what about the weekend? weekend is equal to revenue[5:], meaning index 5 and above. We don't need to specify the end, since the slice runs to the end of the array. Now what if I want alternate values? That is again very simple: every_other_day is equal to revenue[::2]. We consider the entire array, so we leave the start and stop empty, and the step is 2, every second element. That gives us the alternate days. Let's run this; there is no error. In a gist: indexing starts at zero, a negative index counts from the end of the array, and a slice is written as start, stop, step.
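The indexing and slicing steps above can be sketched as follows, using the seven weekly values quoted in the transcript. The variable names are written with underscores as an assumption about what was typed on screen.

```python
import numpy as np

# One week of revenue, Monday (index 0) to Sunday (index 6).
revenue = np.array([10, 12, 9, 11, 13, 8, 15])

first_day = revenue[0]         # indexing starts at 0
last_day = revenue[-1]         # negative index counts from the end
workdays = revenue[0:5]        # indices 0 to 4; the stop index is excluded
weekend = revenue[5:]          # index 5 through the end of the array
every_other_day = revenue[::2] # whole array, taking every second element

print(first_day, last_day)
print(workdays)
print(weekend)
print(every_other_day)
```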
That is the starting index, the ending index, and the step count, and the stop index is excluded. Let's print all of these and see. Say I want to print workdays and weekend. As you can see, the first result holds Monday, Tuesday, Wednesday, Thursday, and Friday, which are the weekdays, and the weekend revenues are 8 and 15. Now let's also print every_other_day and run it again. Here you have every other day: 10, 9, 13, and 15. You can check these results against the array in the same notebook.
That is all about 1D arrays. Now what about 2D arrays? How do you create one? Let's take two weeks of data: revenue_2w is equal to np.array, with parentheses and then nested square brackets. The first row is, as usual, 10, 12, 9, 11, 13, 8, and 15. You can use any sample data you want. For the second row, the second week, I'll use different numbers: 9, 11, 10, 12, 14, 9, and 16. So the first row is Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday, and in week 2 we are back to 9 and 11 for Monday and Tuesday. We have two weeks of data here, and this is how you initialize a 2D array. Let's run this. Let me just add the missing comma and run it again, and it should be fine.
Now, how do you index a 2D array? Say we want only week 2's data. We take revenue_2w with index 1, then a comma, and a colon for the columns: revenue_2w[1, :]. This gives the entire second week: the row at index 1, from the first column until the end. Next, if you want the Monday values, monday_values is equal to revenue_2w with a colon for the rows, because we don't want to restrict the rows; we want both week 1 and week 2, so there is no start or stop there.
Then a comma, and then 0, the first column index: revenue_2w[:, 0]. This gives the Monday values. I hope this is clear. Let's print the week 2 and Monday values and see whether there are any errors. Here you can see that only the second week is displayed, and then we have both Monday values, from week 1 and from week 2. It is a little confusing at the start, but once you understand how indexing works, it is pretty easy to navigate a 1D or 2D array.
Indexing and slicing are done, so let's talk about reshaping. Reshaping basically means converting an array from one shape to another, for example a 1D array into a 2D or 3D array. For that, let's first create a 1D array: flat is equal to np.array, with square brackets, holding 1, 2, 3, 4, 5, 6, and close the brackets. This is a flat 1D array. How do you convert it into two rows and three columns? matrix is equal to flat.reshape(2, 3), two rows and three columns. Run it, and we have converted the array into a 2D array. Now let's print flat and matrix and see the difference. As you can see, flat is 1, 2, 3, 4, 5, 6 in one row, a 1D array, whereas matrix has two rows and three columns: 1, 2, 3 and 4, 5, 6. We have reshaped the array into the number of rows and columns we wanted.
Now how do you get back the original shape? There is simple code for that: back_to_flat is equal to matrix.reshape(-1). This reverses the process. Let's run the code; there is no error. Two points to note. First, reshape changes the view of the same data, without copying in many cases, and -1 lets NumPy auto-calculate the remaining dimension.
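The 2D indexing and reshaping steps above can be sketched together. The name revenue_2w is an assumption based on how the variable is spoken in the transcript; the values are the two weeks quoted there.

```python
import numpy as np

# Two weeks of revenue: each row is a week, each column a day (Mon..Sun).
revenue_2w = np.array([
    [10, 12, 9, 11, 13, 8, 15],
    [9, 11, 10, 12, 14, 9, 16],
])

week_2 = revenue_2w[1, :]         # row at index 1, all columns
monday_values = revenue_2w[:, 0]  # all rows, column at index 0

# Reshaping: 1D -> 2 rows x 3 columns, then back to 1D.
flat = np.array([1, 2, 3, 4, 5, 6])
matrix = flat.reshape(2, 3)
back_to_flat = matrix.reshape(-1)  # -1 lets NumPy work out the length

print(week_2)
print(monday_values)
print(matrix)
print(back_to_flat)
```

The total number of elements must stay the same when reshaping, so six values fit a 2 by 3 or 3 by 2 shape but not 2 by 4.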
The second point is the tie to analytics: reshaping is useful when you get the data as one long vector but need it grouped by weeks, months, or even features. Now let's move on to basic math and stats functions, which are very much needed in data analytics. What are the basic functions? Mean, median, minimum, maximum, standard deviation, sum, and so on. Let's see how to code them. Let's initialize the same array again, so just copy and paste it; you can try different values too. For the mean, it's revenue.mean(). That's all, you're done. Mean is nothing but the average. Next is the median, and here's a catch: NumPy arrays have no median method, so median is called as a NumPy function. You write np.median and pass in revenue. The rest are normal array methods: revenue.min() for the minimum, revenue.max() for the maximum value in the array, revenue.std() for the standard deviation, and the usual revenue.sum(). Let's execute this and see the results. So far we are only calculating the values, not displaying them, so let's add print statements and execute again. Here you go: we have the mean of the array, the median of the array, the minimum, which is 8, the maximum, which is 15, the standard deviation, which is about 2.2, and the sum of the array, which is 78. There is also a function style for all of these. We calculated the mean as revenue.mean(); if you want to call it through NumPy instead, it is np.mean with revenue passed in.
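The statistics walked through above can be sketched as follows, using the same seven weekly values. The variable names are illustrative, not taken from the video.

```python
import numpy as np

revenue = np.array([10, 12, 9, 11, 13, 8, 15])

mean_value = revenue.mean()        # average
median_value = np.median(revenue)  # arrays have no .median() method
min_value = revenue.min()
max_value = revenue.max()
std_value = revenue.std()          # population standard deviation by default
total = revenue.sum()

print("mean:", mean_value)
print("median:", median_value)
print("min:", min_value, "max:", max_value)
print("std:", std_value)
print("sum:", total)

# Function style gives the same result as the method style.
print("np.mean:", np.mean(revenue))
```

Note that revenue.std() divides by the number of elements; passing ddof=1 would give the sample standard deviation instead.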
This applies to everything here, including sum, standard deviation, maximum, and minimum; only the median I wrote as np.median, because it exists only as a NumPy function. These are the same basic statistics that analysts compute in Excel, but now they run instantly on large arrays. Many of these functions accept an axis argument for 2D arrays. So what is this axis? Let's figure it out.
The next topic is axis. We have already seen a 2D array with week 1 and week 2 as rows. Axis 0 runs down the rows, from week to week, and axis 1 runs across the columns, from day to day within a week. That is the entire concept. Let's see a little code so you have a clear picture. First, we have revenue_2w.mean with axis equal to 0. This calculates the mean of each column, so it is the average across the two weeks for each day; for example, the first value is the average of the Monday of week 1 and the Monday of week 2. If I do the same thing with axis equal to 1, this is the average across each week. So axis 0 gives one value per day, and axis 1 gives one value per week. If you run it, the axis 1 result is 11.1428 and 11.57, the averages of week 1 and week 2, and the axis 0 result is the array of per-day averages, one for each column.
Now let's move on to data frames and series. What is a data frame? A data frame is a collection of data arranged in rows and columns. What we did with 1D and 2D arrays used tiny samples, but real data sets can have millions of values. The data frame will be denoted as df throughout the program. To create a data frame, we can say data is equal to a dictionary of columns. The first column is day, which holds the days: Monday, Tuesday, Thursday, Friday, and so on; let's keep it at that. Next we have the sales values in column 2: sales is 120, 135, 150, and 90. Let's add one more column, units, the units sold, which will be 10, 12, 15, and 8. Here we have the data for our data frame.
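For the axis argument discussed earlier in this part, a short sketch makes the two directions concrete. The name revenue_2w and the result variable names are assumptions; the values are the two weeks quoted in the transcript.

```python
import numpy as np

# Rows are weeks, columns are days (Mon..Sun).
revenue_2w = np.array([
    [10, 12, 9, 11, 13, 8, 15],
    [9, 11, 10, 12, 14, 9, 16],
])

# axis=0 collapses the rows: one average per day, taken across both weeks.
avg_per_day = revenue_2w.mean(axis=0)

# axis=1 collapses the columns: one average per week, taken across its days.
avg_per_week = revenue_2w.mean(axis=1)

print(avg_per_day)
print(avg_per_week)
```

A quick way to remember it: the axis you name is the one that disappears from the result, so a (2, 7) array gives 7 values for axis=0 and 2 values for axis=1.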
Now how do you turn that into a data frame? df is equal to pd.DataFrame, and we pass data into it. Before that we have to import pandas; the standard form is import pandas as pd. Let's run this code. There's an error, so let's figure it out: a comma is missing. Let's fix that and run the code again. As you can see, it is converted to a data frame. If I print df, it should appear in a proper table format. We have day with Monday, Tuesday, Thursday, and Friday, then sales, and then units; the data is properly converted into a data frame. This is helpful when you have data that is not yet in a table format: you can just pass it to the DataFrame function from pandas.
Here are some points you should remember. First, a Series has an index plus values. A DataFrame has rows, which are labeled by the index, and columns, which are labeled by their names. Next, you can preview a data frame with df.head(), df.shape, df.info(), and df.describe(). df.head() shows the first five rows. df.shape tells you the size of the table as rows by columns. df.info() gives the data type of each column. And df.describe() gives an overall statistical summary of the table.
Now, importing data sets. Obviously we cannot build entire data sets by typing them manually; the data will come in CSV format, Excel format, and so on. How do you load those data sets into Colab or a Jupyter notebook? The two work in very similar ways. Let me open a new notebook so that it's clear; in Jupyter it's the same approach. Here you can see a folder icon. Click on it, and you'll see an option called upload file. Click on that, and it will show you the documents on your desktop. You can pick any CSV file and upload it.
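The data frame built at the start of this part, along with the preview methods listed, can be sketched as follows. The lowercase column names day, sales, and units are an assumption about the exact spelling used on screen.

```python
import pandas as pd

# A dictionary of columns: each key is a column name, each list its values.
data = {
    "day": ["Monday", "Tuesday", "Thursday", "Friday"],
    "sales": [120, 135, 150, 90],
    "units": [10, 12, 15, 8],
}

df = pd.DataFrame(data)

print(df)             # the full table
print(df.head())      # first five rows (all four here)
print(df.shape)       # (rows, columns)
df.info()             # column names, non-null counts, data types
print(df.describe())  # summary statistics for the numeric columns
```

df.shape is an attribute, so it has no parentheses, while head, info, and describe are methods and need them.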
There are three major formats in which data is usually given: CSV files, Excel files, and JSON files. How do you load each one into a data frame? Here we have df_csv; this is just a variable name, and you can choose any name. df_csv is equal to pd.read_csv, and then the name of the file in quotes, let's say sales_data.csv. That is the format for a CSV file. What about Excel? For Excel it is df_excel is equal to pd.read_excel, and again the name of the file. I'll show you how to get the name of the file; for now let's just use sales_data, and make sure it is in inverted commas. If you want a particular sheet, because a workbook can have several sheets and you want only one, you can pass sheet_name equal to Sheet1. If you rename the sheet, change the sheet name here as well. Now coming to JSON: how do you read a JSON file? Here it is pd.read_json, again with the name of the JSON file, let's say sales_data.json. Now, the Excel line won't work as written because the file extension is wrong, so let's change it: for an Excel sheet the extension is .xlsx. Now here mostly I'll be using Excel file but also works you can use with Excel data as we
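The three loaders named above can be tried with a small self-contained sketch. The file names sales_data.csv and sales_data.json and the sample values are assumptions for illustration; the sketch writes its own temporary files so that it runs without any upload. The Excel call is shown only as a comment because pd.read_excel needs an Excel engine such as openpyxl installed and a real .xlsx file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, written to disk so there is something to load.
sample = pd.DataFrame({
    "day": ["Monday", "Tuesday", "Thursday", "Friday"],
    "sales": [120, 135, 150, 90],
    "units": [10, 12, 15, 8],
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "sales_data.csv")
    json_path = os.path.join(tmp, "sales_data.json")
    sample.to_csv(csv_path, index=False)
    sample.to_json(json_path, orient="records")

    df_csv = pd.read_csv(csv_path)     # CSV file
    df_json = pd.read_json(json_path)  # JSON file

    # Excel file (requires an engine such as openpyxl and an existing file):
    # df_excel = pd.read_excel("sales_data.xlsx", sheet_name="Sheet1")

print(df_csv)
print(df_json)
```

In Colab or Jupyter, the path passed to each reader is simply the name of the uploaded file, in quotes, if it sits in the notebook's working directory.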
Original Description
🔥Professional Certificate in AI and Machine Learning - https://www.simplilearn.com/professional-aiml-program?utm_campaign=xH4fDzz6vXU&utm_medium=DescriptionFirstFold&utm_source=Youtube
🔥IIT Kanpur - Professional Certificate Course in Generative AI and Machine Learning - https://www.simplilearn.com/iitk-professional-certificate-course-ai-machine-learning?utm_campaign=xH4fDzz6vXU&utm_medium=DescriptionFirstFold&utm_source=Youtube
🔥IITM Pravartak - Advanced Executive Program In Applied Generative AI - https://www.simplilearn.com/applied-generative-ai-course?utm_campaign=xH4fDzz6vXU&utm_medium=DescriptionFirstFold&utm_source=Youtube
This video, Data Analytics with Python 2026 by Simplilearn, explains how Python is used as a powerful tool for modern data analytics and why it is widely preferred across industries. The video introduces the role of data analytics in today’s data-driven world and shows how Python helps in collecting, cleaning, analyzing, and visualizing data to generate insights. It highlights how organizations use analytics to understand customer behavior, track performance, identify trends, and support business decisions. The video also explores Python libraries such as NumPy for numerical computations, Pandas for data manipulation and cleaning, and Matplotlib and Seaborn for data visualization. It explains how these tools simplify data tasks and improve efficiency when working with large datasets. In addition, the video demonstrates key analytics workflow steps including data preparation, exploratory data analysis, visualization, and automation techniques that help analysts generate insights faster.
The following topics are covered in this tutorial on Data Analytics With Python 2026:
00:00:07 Course Introduction and Role of Data Analyst
00:10:18 Setting Up Python Analytics Environment
00:20:13 Python Fundamentals for Data Analysis
00:58:40 Data Cleaning, Visualization and Exploratory Data Analysis
01:27:44 Real-World Project, Automation, and Ca