Granger Causality in Python : Data Science Code

ritvikmath · Advanced ·📐 ML Fundamentals ·5y ago

Key Takeaways

This video demonstrates how to run a Granger causality test in Python using the grangercausalitytests function from the statsmodels library, to check whether one time series Granger-causes another.

Full Transcript

Hi everyone, welcome back. This is going to be a very, very short video just showing you how to do the Granger causality test in Python. I had a whole video on the theory of the Granger causality test, and I'll link that in the description below.

Just in a nutshell again, what do we use Granger causality for? A lot of times we'll have two or more time series, and we'll be interested in whether one of the time series causes another. To give a real-world example, let's say we have one neighborhood and another neighborhood that's right next to it. We might have a case where house prices in one neighborhood change, and then house prices in the other neighborhood change in a similar way, except it takes one or two months for that change to be seen. In that case, the house price in one neighborhood is causing the house price in the other neighborhood. Now, it's actually pretty difficult to figure out this causality; it takes a lot of very difficult work. So we use kind of a proxy called Granger causality, which just looks at the two time series and sees if we can use a shifted version of one of them to predict the other well. Again, the theory video is linked in the description below; this is just a very short code video.

The only special import you're going to need is from statsmodels.tsa.stattools import grangercausalitytests. As you might have guessed, we're not writing any of the code ourselves; it's already been written. I will just show you how to properly call this function and how to interpret the results.

We'll be using some simulated data today, just to make things easier so you don't have to download any data yourself. Here I've generated a very simple AR(1) process. The coefficient is 0.5, really nothing special going on here, and that's called t1. Then t2 is going to be the same time series as t1, except I add a little bit more random noise at each step to make it more realistic looking. This cell right here is basically shifting the time series by three months, or weeks, or whatever unit of time you want to consider, so that time series t2 now follows the same signature as time series t1 except it's shifted three months into the future.

The easiest way to see that is in this graph. t1 is the blue curve and t2 is the red curve. Although it's a little bit difficult to see, the red curve is roughly a shifted version of the blue curve. For example, look at this peak in the blue curve here: three months later we see a similar peak in the red curve. Look at this spike in the blue curve here: in the red curve we see a similar spike about three months later. Of course, there is some random noise added to the red curve, so it's not an exact copy shifted three months, but it is a rough copy shifted three months.

So how do we pick up on that using code? First we need to pack these two time series into a pandas DataFrame, and here is a very, very important part. You're going to have two columns in your DataFrame, and those are going to be your two time series. You want to make sure the first column is the time series that you think is being Granger-caused by the second one. Here, what I'm doing is checking whether time series t2 is Granger-caused by time series t1; therefore t2 goes in the first column and t1 goes in the second column. I've constructed that DataFrame here, just two columns, one with time series t2 and the other with time series t1.

And then our hard work is done. All we have to do is call the grangercausalitytests function, pass in our DataFrame, and pass in the number of lags that we want to check. By putting in three, I'm saying go ahead and check whether it Granger-causes at one lag, at two lags, and at three lags. You can do as many as you want.

Let's first look at what happens if we consider one lag. It runs these four tests, and we can look at the p-values, which are in the same ballpark and are all pretty far away from 0.05. So we can confidently say that t2 is not Granger-caused by t1 looking at just one lag alone. Now let's look at two lags. Here the p-values are getting lower but are still kind of far away from low values like 0.05, so we can also say that time series t2 is not Granger-caused by t1 looking at two lags only. Now here's where it gets interesting. If we look at three lags, we see that all the p-values go to zero or virtually zero, so we have very strong evidence to say that time series t2 is Granger-caused by time series t1 when looking at three lags. That's exactly the way we constructed it, and it's nice to see that we get the same result when we actually call the function.

So that's about it, y'all. I just wanted to quickly show you how to do a Granger causality test in Python. This code will, as always, be available in the description below, and I'll see you next time.

Original Description

Coding Granger Causality in Python!
Granger Causality Theory Video: https://www.youtube.com/watch?v=b8hzDzGWyGM
Link to Code: https://github.com/ritvikmath/Time-Series-Analysis/blob/master/Granger%20Causality.ipynb

This video shows how to run a Granger causality test in Python using the statsmodels library and simulated data. The test checks whether one time series is Granger-caused by another, that is, whether lagged values of the second series help predict the first.
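
For reference, the comparison behind the test can be written as two regressions. This is the standard textbook formulation rather than something shown in the video, and the symbols ($y$ for the first column, $x$ for the second column, $p$ for the number of lags, and the coefficients $c$, $a_i$, $b_i$) are notation introduced here.

```latex
\begin{align}
\text{restricted:}\quad   y_t &= c + \sum_{i=1}^{p} a_i\, y_{t-i} + \varepsilon_t \\
\text{unrestricted:}\quad y_t &= c + \sum_{i=1}^{p} a_i\, y_{t-i} + \sum_{i=1}^{p} b_i\, x_{t-i} + u_t \\
H_0 &: b_1 = b_2 = \dots = b_p = 0
\end{align}
```

A small p-value rejects $H_0$, meaning the lagged values of $x$ improve the prediction of $y$ beyond what $y$'s own past already provides.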

Key Takeaways
  1. Import necessary libraries
  2. Generate or load time series data
  3. Create a pandas DataFrame with the time series data
  4. Call the grangercausalitytests function
  5. Specify the number of lags to check
  6. Interpret the p-values to determine Granger causality
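💡 The Granger causality test checks whether past values of one time series help predict another, which is a proxy for causation rather than proof of it. The number of lags matters: in the video's simulated example, the test is not significant at one or two lags but strongly significant at three.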
