MA Model Code Example : Time Series Talk

ritvikmath · Intermediate ·📐 ML Fundamentals ·6y ago

Key Takeaways

The video demonstrates how to code a Moving Average (MA) model for time series analysis using Python libraries such as statsmodels, numpy, and matplotlib. It covers generating a pure MA process, determining the order of the MA process using ACF and PACF, building the model, and making predictions.

Full Transcript

Hey everyone. In this video we'll be looking at the moving average model, looking at some code and some data to see how it actually works in practice. This will be a pretty quick video because the setup is similar to other videos and a lot of the same functions will be used, so the focus here is going to be on how to create a moving average model and how to interpret the results of our model.

The first thing we'll do, which is different from previous videos, is generate our own data. The reason is that there aren't a lot of pure moving average processes out in the wild. If we look at some real data, usually the moving average component is just one part, alongside other parts like the autoregressive part, maybe some seasonality, maybe some GARCH or ARCH stuff. Usually we're not looking at a pure moving average, but given this video is only about moving average, I want to generate a pure moving average process for us.

I'm going to generate it according to this very simple model. The time series at any time step t, so y sub t, is given by 50, which is just a constant, plus 0.4 times epsilon sub t minus 1, which is the innovation from one period ago, plus 0.3 times the innovation from two periods ago, plus of course the innovation from this period. I'm saying innovation; you can think of it as error, whatever you want to call it. Each error or innovation is normally distributed with mean 0 and standard deviation 1.

So this is how the data will be generated, and I'll quickly walk through how I generated it, because I do think there's a little bit of value there. First I generate this series of errors. I generate 400 of them just for safety; I'm just going to end up using a couple of them, but I have a lot in case we want to scale it up in the future. I've decided to generate data from September 1st, 2019 to January 1st, 2020, so that's about four months of data. Here's me actually generating the data: I set my mu to 50, I start with an empty series here, and then I just loop over the range of the date index, and each time I append according to the formula above: mu plus 0.4 times the error one period ago, plus 0.3 times the error two periods ago, plus the error in the current period. That's me generating my series. I go ahead and convert it to a pandas Series so it behaves a little bit nicer with the plots below, and I make it infer the frequency so that it can be plotted correctly.

Here's me plotting the series. We see it starts on September 1st, 2019 and ends on January 1st, 2020. You can't tell a lot just by looking at it, but it looks somewhat random, and of course it's centered around its constant mean of 50.

Now let's go ahead and generate our ACF and PACF. Here's our ACF. If you notice, I'm generating them slightly differently: I'm not using plot_acf and plot_pacf from statsmodels; instead I'm just using acf and pacf, because I want to plot them myself and generate the values in case I want to use those values later on for something else. These are two functions also from statsmodels.tsa.stattools, and they're called acf and pacf. When I do that, I still get my ACF plot here. We see that lag 0 is of course 100 percent, lag 1 is very strong, lag 2 is very strong, and after that they kind of die down. This is already an indication. Remember, we use the ACF to tell us about the moving average part of an ARMA model, so we'd maybe use an MA(2) model based on what I'm seeing here, because lag 2 is the last strong one before it cuts off and goes close to 0.

If we look at the PACF, and I've plotted many lags here, we see the PACF has this characteristic pattern of alternating and maybe diminishing over time, so that's more evidence that this is a moving average process. Given the ACF and PACF, we say: okay, we're going to start with an MA(2) process and maybe go from there.

In the same way as in the previous videos, I generate my training and testing sets. One very important note about the moving average process is that once you set the order of the process, the order being two for us because we're going to use two lags, you can only predict that many periods into the future. After that, the prediction is just going to be the constant mean of 50. If you want to understand why that's true, check out some of my theoretical videos, but it is just a fact that if you're using an MA(4) model, for example, you can predict four periods into the future and no more; after that it's just going to predict the mean for any future periods. That's why our training end is the 30th of December and our testing end is two days later, on the 1st of January. We're only going to predict two periods here.

Now here's us fitting the model. I use the ARIMA function here, whereas I used the ARMA function in the previous video. It's from the same library, statsmodels.tsa (tsa being time series analysis), in arima_model. So instead of ARMA I imported ARIMA. It works very much the same way; it just gives us some more flexibility through the I term, which we'll get to in a future video. We won't be needing that I term here. Here our simple model is just AR order 0, I order 0, and MA order 2, so this is a pure MA(2) process. We go ahead and create the model on the training data, we fit it using this function, and we print the summary.

Here's something I want to spend a little bit of time dissecting. The first thing we should look at is that this is an MA(2) process, so ARMA(0, 2), as we created it. The next thing we'll look at is these coefficients. There's a constant, which is estimated very close to the true constant of 50, so it got that pretty close. Then there are the coefficients of the MA lag 1 term and the MA lag 2 term. The MA lag 1 term was estimated to be about 0.37, whereas in reality it's 0.4, so pretty close. The MA lag 2 term in reality is 0.3 and it was estimated as 0.25, so again pretty close. Not exact in either case, but it's getting pretty close to the true generating process behind this data. And we see for both of them that the p-values, this P>|z| column, are very low, so they are significant, which means we should keep them in our model. So far we've basically said that our choice of an MA(2) process is reasonably well founded, because we get two terms which are very close to their true values and they're both significant.

Okay, so our predicted model looks like this. If we want to predict any future y sub t, it's going to be 50 (truly it's 50.0-something, but I just called it 50) plus 0.37 times epsilon sub t minus 1, plus 0.25 times epsilon sub t minus 2, plus the error in the current period. Actually, I should get rid of that, right? Because we can't actually use that in our predictions. That's a catch on the fly. So if we want to predict something, we use this model here.

The last thing we'll do here is predict those two periods in advance. Here's our testing set, and this orange line is our prediction. We see it's pretty close; it's not exact, of course, but it's pretty close to the true value of the data, which is in blue. If we want to calculate some metrics on our errors, we see that the mean absolute percent error is 0.0053 and the root mean squared error is about 1.8, so this prediction is okay on average. I do want to emphasize again that I only predicted two periods in advance, the first one being this and the second one being this, because if I predicted any more, it's just going to predict the mean value of 50. That's something that's characteristic of an MA process.

Okay, so this is, in a nutshell, how to generate data according to a given MA process, a moving average process; how to look at the ACF and PACF and form some general ideas about what order we should set; how to generate some training and testing sets; and how to fit the model, interpret the results, and make some predictions based on that model. Next time we'll put the AR and MA together and look at a full ARMA model. All right, so until next time.
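
The data-generation and ACF steps described in the transcript can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's code: it uses only numpy, leaves out the pandas date index and the plots, and the helper names simulate_ma2, sample_acf, and theoretical_ma2_acf are hypothetical. The theoretical autocorrelations of an MA(2) process are included for comparison, since they are zero beyond lag 2, which is the cutoff the ACF plot is used to spot.

```python
import numpy as np


def simulate_ma2(n, mu=50.0, theta1=0.4, theta2=0.3, seed=0):
    """Simulate y_t = mu + theta1*e_{t-1} + theta2*e_{t-2} + e_t, with e_t ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    # Two extra draws supply the lagged errors for the first two observations.
    errors = rng.normal(loc=0.0, scale=1.0, size=n + 2)
    current = errors[2:]    # e_t
    lag1 = errors[1:-1]     # e_{t-1}
    lag2 = errors[:-2]      # e_{t-2}
    return mu + theta1 * lag1 + theta2 * lag2 + current


def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    centered = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(centered, centered)
    acf = [1.0]  # lag 0 is always 1
    for k in range(1, max_lag + 1):
        acf.append(np.dot(centered[k:], centered[:-k]) / denom)
    return np.array(acf)


def theoretical_ma2_acf(theta1, theta2):
    """Lag-1 and lag-2 autocorrelations of an MA(2) process (zero for lags > 2)."""
    gamma0 = 1.0 + theta1**2 + theta2**2
    rho1 = (theta1 + theta1 * theta2) / gamma0
    rho2 = theta2 / gamma0
    return rho1, rho2


if __name__ == "__main__":
    series = simulate_ma2(5000)
    print("sample ACF, lags 0-4:", np.round(sample_acf(series, 4), 3))
    print("theoretical rho1, rho2:", theoretical_ma2_acf(0.4, 0.3))
```

With the video's coefficients, the theoretical values are 0.52 / 1.25 = 0.416 at lag 1 and 0.3 / 1.25 = 0.24 at lag 2. A series as short as the four months used in the video will scatter noticeably around those values, so the sketch uses a longer sample by default.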

Original Description

Coding the MA Model: - Generate your own MA process - Use ACF and PACF to determine order of MA process - Build the model - Make predictions Code used in this video: https://github.com/ritvikmath/Time-Series-Analysis/blob/master/MA%20Model.ipynb

This video shows how to implement a Moving Average (MA) model for time series analysis in Python. It simulates a pure MA(2) process, uses the ACF and PACF to choose the order of the model, fits the model, and forecasts two periods ahead.
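
The limit on how far an MA model can usefully forecast can be illustrated with a small hand computation. The sketch below is an assumption-laden illustration rather than anything shown in the video: the function name ma_forecast is hypothetical, the coefficients are the true generating values 0.4 and 0.3, and the two "recent errors" are made-up numbers. In practice the errors are not observed directly and would be estimated as model residuals.

```python
def ma_forecast(mu, thetas, recent_errors, steps):
    """Forecast an MA(q) process `steps` periods past the last observation T.

    thetas[j - 1] is the coefficient on the error from j periods back.
    recent_errors[0] is e_T (most recent), recent_errors[1] is e_{T-1}, and so on.
    Future errors are unknown and have mean zero, so they drop out of the forecast.
    """
    q = len(thetas)
    forecasts = []
    for h in range(1, steps + 1):
        value = mu
        # The term theta_j * e_{T+h-j} is known only when T+h-j <= T, i.e. j >= h.
        for j in range(h, q + 1):
            value += thetas[j - 1] * recent_errors[j - h]
        forecasts.append(value)
    return forecasts


if __name__ == "__main__":
    # Hypothetical last two errors: e_T = 1.0, e_{T-1} = -2.0
    print(ma_forecast(50.0, [0.4, 0.3], [1.0, -2.0], steps=4))
```

For an MA(2) model, the first forecast uses both known errors, the second uses only the most recent one, and every forecast after that is just the constant mean, which matches the behaviour described in the transcript.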

Key Takeaways
  1. Generate a series of errors ε_t
  2. Generate the time series y_t using the moving average model
  3. Plot the time series
  4. Calculate the ACF and PACF of the time series
  5. Determine the order of the moving average model based on the ACF and PACF
  6. Fit an MA(2) model using the ARIMA function with order (0, 0, 2)
  7. Print summary of model
  8. Predict two periods ahead using the fitted MA(2) model
  9. Calculate mean absolute percent error and root mean squared error
💡 The video highlights the importance of determining the order of the MA process using ACF and PACF, and how to implement the MA model using the ARIMA function from the statsmodels library.
