Rolling statistics - p.11 Data Analysis with Python and Pandas Tutorial

sentdex · Beginner ·🛠️ AI Tools & Apps ·10y ago

Key Takeaways

This video, part 11 of a tutorial series, demonstrates rolling statistics with Pandas in Python: the moving average, rolling standard deviation, and rolling correlation, applied to housing price index data and to a simple investment idea.

Full Transcript

What's going on, everybody? Welcome to part 11 of our Data Analysis with Python and Pandas tutorial series. In this part, we're going to be talking about rolling statistics and other things that we can do in a rolling fashion.

First of all, let's bring up the documentation. There's a link to this right at the top of the text-based tutorial. Basically, these are all the things that we can do in a rolling manner, and a really important one is this one here, but we'll get to that. We can do things like rolling count, rolling sum, rolling mean, all that kind of stuff. We'll check out the rolling mean. You can also do rolling minimum.

What does this mean? With rolling, you take a window of time, and in that window of time we can do all of these things. So with rolling sum, in that window of time you add all the values together. With rolling mean, in that window of time you calculate the average of everything, and you keep moving that way. Then there's max, standard deviation, and so on. And with rolling apply, you can write your own function that deals with window data and do anything. So if all these don't suit whatever purpose you might have, rolling apply is your guy. That was good. Anyway, I'm here all night.

Let's go ahead and calculate a rolling mean, which is also known as a moving average. We can leave this stuff here. This is what we did before with TX1yr. We're going to redefine it, and instead of TX1yr let's call it TX12MA or something like that. I'm going to copy this and do a replace, so Ctrl+H for find. We're going to find all TX1yr and replace them with TX12MA. Replace all, good to go. Let's just check that it went correctly.
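The window idea described above can be sketched with a few built-in statistics. This is a minimal illustration on made-up numbers, not the tutorial's housing data. It uses the `Series.rolling(...)` method, the spelling in newer pandas releases, where the standalone `pd.rolling_*` functions named in the video are no longer available. The variable names `values`, `window`, and `summary` are assumptions chosen for this example.

```python
import pandas as pd

# Hypothetical toy data, not the tutorial's housing price index.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A window of 3 observations slides along the series one step at a time.
window = values.rolling(window=3)

# Each statistic is computed over the current value and the 2 before it.
summary = pd.DataFrame({
    "value": values,
    "sum": window.sum(),
    "mean": window.mean(),
    "min": window.min(),
    "max": window.max(),
    "std": window.std(),
})

# The first two rows are NaN because a full 3-value window does not exist yet.
print(summary)
```

The same window object is reused for every statistic, which keeps the window size in one place if you decide to change it later.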
Cool. All right, let's close this out. Instead of resampling here and filling in, which we don't need to do right now, we're going to do this. We don't need to print the head here; we'll print that after. So TX12MA is not going to be a resample. Instead, it's going to be a moving average. We're going to use pd.rolling_mean. First we pass the data we want the rolling mean for, and then we choose the window, so we'll do 12. That's 12 data points, and it just so happens that each data point is a month, so this is a 12-month moving average, otherwise known as a one-year moving average. That creates a new column. We'll graph it, and there you have it: a 12-month moving average. Again, we're forfeiting the little squiggles, which are actually highly valuable to us, because they show us that every six months, basically, we go through a cycle like clockwork.

One thing you will notice is that right here you start with NaN, not a number. Why do we have NaNs there? The problem is that we have a 12-period moving average, and you can't calculate a 12-period moving average on data point number five. It's just not possible. The other thing you'll notice, let's run that one last time, is that it starts a little later. If we zoom into this point, you'll see that the moving average started after these points. If there's a significant delay there, it becomes really obvious. Instead of 12 we could do 120, so 120 months. That's a lot of months, but then you wind up with something like this. That's kind of a problem; that looks silly. So this would be a scenario where, if you want everything to line up nice and pretty, you could do HPI_data.dropna, and then of course we'll need inplace=True. Save and run that, and now it's chopped off all the old data. Anyway, that's a moving average. We're not going to keep dropping NaNs, and we'll change this back to 12 and retain the data here.

Another good one is the standard deviation. Standard deviation is good for a lot of reasons. One, it can help us identify problem points and outliers, but it can also help us detect volatility in the market. A greater standard deviation means things are moving around a lot more, so that's a good indication of volatility. Let's create a new column: just copy this, paste, and instead of MA that's STD, and instead of rolling_mean it's rolling_std. Everything else can stay the same. We'll do a 12-month standard deviation.

Now, the problem with standard deviation is that it's not at the same scale as the housing price index. Standard deviation is how much deviation there is, so it's almost always going to be a small fraction of the original values. So what we need to do is graph it on a different graph. We can go ahead and see that this creates TX12STD, and let's add that into the printing up here as well. Save and run that, and as you can see, this is the original data and this is the standard deviation down here. You can kind of tell the fluctuations in the standard deviation, but we would like to see them on a greater scale: the obvious increase here, decrease here, increase here, and so on.

So what could we do? We can close this and graph it on a new graph entirely. Again, this isn't really a Matplotlib tutorial series, so we're going to kind of run through this. It's covered in the more in-depth data visualization series, but you could probably pick it up real quick now. These are your axes. You've got a figure, and then you've got subplots on a figure, and a subplot is also your axis, basically. So we're going to create a new axis: copy, paste. The grid now is going to be 2x1, that is, two tall and one wide, so we'll have a graph on top and a graph on the bottom. This one will start at (0, 0), totally fine. This one will start at (1, 0), and it's also going to share the x-axis of ax1. That means we can zoom in to either of the graphs and both graphs will zoom in for us, nice and neat. And this is now ax2, for axis 2.

Then we're not going to plot that there. We'll come down here. We still do want to plot it, just on a different axis. HPI_data['TX12MA'], that's what we want to plot, then .plot, and this one we're going to plot with ax=ax2. Now we'll save and run that. What did we do? We did MA instead of STD, I guess. We sure did. Why didn't anybody tell me? Let's run that one more time. There we go. So there you have it, and this is now the standard deviation. This is why we share the axes, because we could zoom into this point and both graphs zoom in to that exact point. That's pretty nifty, because you can see here that as prices are going up, sure enough, the volatility is going up. And then up here we're really starting to rocket up, and we are picking that up on the standard deviation graph. So that's pretty cool.

We'll close this out. Another pretty nifty one is rolling correlation. This one's a little more tricky, but we'll keep the figure here, and we can plot on the two axes, I suppose. Let's delete everything down to here.
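The moving average and standard deviation steps described above can be sketched as follows. This is a hedged illustration only: the data is synthetic (a made-up trend with a six-month wiggle), `hpi_data` is a hypothetical stand-in for the tutorial's `HPI_data`, and the calls use the `.rolling()` method rather than the `pd.rolling_mean` and `pd.rolling_std` functions spoken in the video, which newer pandas releases no longer provide. The helper `plot_price_and_volatility` is an assumption added for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tutorial's HPI_data: 10 years of monthly values
# with an upward trend plus a repeating six-month wiggle.
months = np.arange(120)
dates = pd.date_range("2000-01-01", periods=120, freq="MS")
hpi_data = pd.DataFrame(
    {"TX": 100 + 0.5 * months + 2 * np.sin(2 * np.pi * months / 6)},
    index=dates,
)

# 12-month moving average and 12-month rolling standard deviation.
hpi_data["TX12MA"] = hpi_data["TX"].rolling(window=12).mean()
hpi_data["TX12STD"] = hpi_data["TX"].rolling(window=12).std()

# The first 11 rows have no full 12-month window, so both new columns are NaN there.
print(hpi_data.head(12))

# Optional: drop the incomplete rows so every column starts on the same date.
trimmed = hpi_data.dropna()


def plot_price_and_volatility(frame):
    """Plot TX and its moving average on top and the rolling std below."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig = plt.figure()
    # Two rows, one column; the lower axis shares the x-axis with the upper one,
    # so zooming one graph zooms the other.
    ax1 = plt.subplot2grid((2, 1), (0, 0))
    ax2 = plt.subplot2grid((2, 1), (1, 0), sharex=ax1)
    frame[["TX", "TX12MA"]].plot(ax=ax1)
    frame["TX12STD"].plot(ax=ax2, label="TX12STD")
    ax2.legend(loc=4)
    return fig
```

Calling `plot_price_and_volatility(hpi_data)` followed by `matplotlib.pyplot.show()` would display the two stacked graphs. Putting the standard deviation on its own axis matters because its values are a small fraction of the price values and would look nearly flat on a shared y-axis.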
We'll say TX_AK_12corr, so we're going to measure the correlation between these guys. That'll be pd.rolling_corr, and we want to apply this to HPI_data of Texas. Again, it's Texas to AK, which, if I'm recalling right, is actually Alaska. So Texas is the first one, then HPI_data['AK'], and then 12 periods. That's going to calculate the correlation, but again, this needs to be on a separate graph. Texas and Arkansas can be on the same graph, though. So we could say HPI_data['TX'], and TX needs to be caps, then .plot with ax=ax1, and the label will be "TX HPI". Then we'll copy this, come down here, paste, and change TX to AK here and over here. Then we'll say ax1.legend with loc=4. Then we'll come down here and say TX_AK_12corr.plot with ax=ax2 and a label. Let's just copy and paste this name for the label.

So this is a rolling correlation between these two. What would you do with this? As we noted before, the correlation between all states is extremely high. We can see that the average correlation here is quite high; it keeps coming back up to this point. But we can also see that, oh my goodness, the correlation drops to almost minus one here. It pretty much hits it. We know, based on the last 40 years of correlation data, since we pulled up that correlation table not long ago, that every state basically follows every other state. They all follow the same housing market no matter what. With correlation, if you have correlation in the negatives, that means they're trending in different directions. So what you want is to find scenarios where correlation is in the negatives, and in the most ideal scenario you would find a way to short. In this scenario here, Alaska is all the way down here in the dumps, whereas Texas keeps going up. So to have a completely market-neutral strategy, you would short Texas and you would buy Arkansas. Now, it's really kind of difficult to short, let's say, the Alaska housing market, so at the very least what you would do is buy Alaska. I'm not sure if I keep saying Arkansas or not, but if I say Arkansas, AK is Alaska.

So in reality you would just buy Alaska every time it dips. And sure enough, if you buy Alaska here, that means you bought it here: good for you. You buy Alaska here, that means you bought it here, and you probably did okay. It goes up slightly, but then you would get out of it once the correlation returns back to the typical one. Then again you would buy Alaska here, you'd probably continue holding it through here, which is fine, and then basically by this point you sold again. Good for you, you just doubled your money or more than doubled your money. And basically no other time did it drop down to that one. But that's kind of what you would be looking for. You would be looking for when states differ from another state, if you could find a way to pair trade the states, or you would just find all states that diverge from the housing price index, that have a negative value. That's where you would buy your next house or invest in some property. That would be the idea, anyway.

So that is rolling correlation, or really, rolling statistics. There are a bunch of rolling statistics here, but rolling correlation, at least in terms of the housing price index and the housing market, obviously makes the most sense. Again, we already tested the correlation between every state and every other state, and we could also test the correlation between every state and the housing price index. We found that the worst was actually still a positive 74%. So any time you've got divergence to this degree, you buy whatever's diverging that much. At least in the last 40 years that's true, and you would go for that. It just makes sense.

Anyway, that's it for this tutorial. Questions, comments, whatever, leave them below. Otherwise, as always, thanks for watching, thanks for all the support and subscriptions, and until next time.
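The rolling correlation described in the transcript can be sketched on synthetic data. This is an illustration under stated assumptions, not the tutorial's result: the `TX` and `AK` columns below are invented numbers built so that the two series move together for two years and then diverge, and `hpi_data`, `tx_ak_12corr`, and `diverging` are names chosen for this example. The call `Series.rolling(12).corr(other)` is the newer pandas spelling of the `pd.rolling_corr` call spoken in the video.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for HPI_data['TX'] and HPI_data['AK']; made-up numbers.
months = np.arange(48)
dates = pd.date_range("2000-01-01", periods=48, freq="MS")
wiggle = 0.3 * np.sin(2 * np.pi * months / 6)

tx = 100 + 1.0 * months  # rises steadily for all four years
ak = np.where(
    months < 24,
    100 + 0.8 * months,                     # first two years: rises with TX
    100 + 0.8 * 24 - 0.5 * (months - 24),   # last two years: falls while TX rises
) + wiggle

hpi_data = pd.DataFrame({"TX": tx, "AK": ak}, index=dates)

# 12-month rolling correlation between the two columns.
tx_ak_12corr = hpi_data["TX"].rolling(window=12).corr(hpi_data["AK"])

# Months where the two series have been trending in opposite directions.
diverging = tx_ak_12corr[tx_ak_12corr < 0]

print(tx_ak_12corr.dropna().round(2))
print("First month with negative correlation:", diverging.index[0].date())
```

Filtering for negative values is one simple way to flag the divergence periods the speaker points at on the graph; whether such a period is a buying opportunity is the speaker's opinion, not something the code establishes.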

Original Description

Welcome to another data analysis with Python and Pandas tutorial series, where we become real estate moguls. In this tutorial, we're going to be covering the application of various rolling statistics to our data in our dataframes. One of the more popular rolling statistics is the moving average. This takes a moving window of time and calculates the average, or mean, of that time period as the current value. In our case, we have monthly data, so a 10-period moving average would be the current value plus the previous 9 months of data, averaged. Doing this in Pandas is incredibly fast. Pandas comes with a few pre-made rolling statistical functions, but also has one called rolling_apply. This allows us to write our own function that accepts window data and applies any bit of logic we want that is reasonable. This means that even if Pandas doesn't officially have a function to handle what you want, they have you covered and allow you to write exactly what you need. Let's start with a basic moving average, or a rolling_mean as Pandas calls it. You can check out all of the moving/rolling statistics in Pandas' documentation.

Text tutorial and sample code: http://pythonprogramming.net/rolling-statistics-data-analysis-python-pandas-tutorial/
http://pythonprogramming.net
https://twitter.com/sentdex
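The description's definition of a 10-period moving average, and the idea of applying your own function to each window, can be checked with a short sketch. The numbers are invented, and `window_range` is a hypothetical custom statistic chosen for this example. The sketch uses `.rolling(...).apply(...)`, the newer pandas counterpart of the `rolling_apply` function named above.

```python
import pandas as pd

# Hypothetical monthly values, for illustration only.
monthly = pd.Series(
    [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0, 17.0, 19.0, 21.0, 20.0]
)

# 10-period moving average: the current value plus the previous 9, averaged.
ma10 = monthly.rolling(window=10).mean()

# The same number computed by hand for the last month.
manual_last = monthly.iloc[-10:].mean()


def window_range(window_values):
    """Custom statistic: gap between the highest and lowest value in the window."""
    return window_values.max() - window_values.min()


# raw=True passes each window to the function as a NumPy array.
range10 = monthly.rolling(window=10).apply(window_range, raw=True)

print(ma10.iloc[-1], manual_last)
print(range10.dropna().tolist())
```

Any function that takes one window of values and returns a single number can be passed to `apply` in the same way.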


This video teaches how to apply rolling statistics using Pandas in Python, covering the moving average, rolling standard deviation, and rolling correlation, and how the speaker would use them to think about an investment strategy. By following along, viewers can learn how to calculate a rolling mean, measure the rolling correlation between two data series, and spot trends and divergences in the data. The tutorial takes a hands-on approach, which makes it practical for beginners.

Key Takeaways
  1. Calculate a rolling mean for 12 months
  2. Define a new column as a rolling mean
  3. Rename the column in the code with the editor's find and replace (TX1yr to TX12MA)
  4. Create a new column for rolling standard deviation
  5. Plot the rolling standard deviation on a new graph
  6. Run rolling correlation between Texas and Alaska housing prices
  7. Plot Texas and Alaska housing prices on the same graph
  8. Consider shorting Texas and buying Alaska as a hypothetical market-neutral pair trade when the rolling correlation turns negative
💡 Rolling statistics can be used to identify trends in data, detect outliers and volatility, and flag divergences between series, which makes them valuable tools for data analysis and investment decision-making.
