Describing a Moving Average Model
- A moving average model is a linear regression model in which the response variable is some value indexed by the current time period, and the predictor variables are the mean of that value and white noise error terms from the current and previous time periods
- The moving average model should not be confused with the simple moving average, which takes the arithmetic mean of a given set of prices over a given number of past days
- Specifically, we can define a moving average model of order $q$ as the following:

$$
y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}
$$

- Where $y_t$ is the unknown true value in our current time period
- Where $\mu$ is the mean of our values (this is constant for any $t$)
- Where $\epsilon_t$ is the unknown error between the current predicted value and what we will observe
- Where $\epsilon_{t-1}$ is the error between the previous predicted value and what we observed
- Where $\theta_1$ represents the percentage of the error $\epsilon_{t-1}$ we should include in our model
- Where $\epsilon_{t-q}$ is the error between the value predicted $q$ periods ago and what we observed in that period
- Where $\theta_q$ represents the percentage of the error $\epsilon_{t-q}$ we should include in our model
- We can estimate $y_t$ by using the following equation:

$$
\hat{y}_t = \mu + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}
$$

- Where $\hat{y}_t$ is the predicted value in our current time period
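
The estimate above is a weighted sum of past errors added to the mean. As a minimal sketch (the function name `ma_predict` and the example inputs are hypothetical, chosen only for illustration), a single one-step prediction could be computed like this:

```python
def ma_predict(mu, thetas, past_errors):
    """One-step MA(q) prediction: mu + sum_i theta_i * eps_{t-i}.

    thetas[0] pairs with the most recent error eps_{t-1},
    thetas[1] with eps_{t-2}, and so on.
    """
    if len(thetas) != len(past_errors):
        raise ValueError("need exactly one past error per weight")
    return mu + sum(th * e for th, e in zip(thetas, past_errors))


# Hypothetical MA(2) inputs: mean 10, weights 0.5 and 0.5,
# most recent error 1, the error before it -2.
print(ma_predict(10, [0.5, 0.5], [1, -2]))  # 9.5
```

The current error $\epsilon_t$ is left out on purpose: it is unknown at prediction time, so only the already observed errors enter the estimate.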
An Example of a Moving Average Model of Order 1
- Let's say we're predicting the price of salmon each month using a moving average model that uses only the previous month's data
- We can define our model as the following:

$$
\hat{y}_t = \mu + \theta \epsilon_{t-1}
$$

- Where $\hat{y}_t$ is our current prediction
- Where $\hat{y}_{t-1}$ is our previous prediction
- Where $\mu$ is the mean of our values
- Where $\epsilon_{t-1} = y_{t-1} - \hat{y}_{t-1}$ is the difference between the previous observed value and our previous prediction
- Where $\theta$ represents the percentage of the previous error we should include in our model
- The table below shows our observed data and predictions of a few iterations
month | $t$ | $\hat{y}_t$ | $\epsilon_t$ | $y_t$ | $\mu$ | $\theta$ | $\theta \epsilon_t$ |
---|---|---|---|---|---|---|---|
Jan | 1 | 10 | -2 | 8 | 10 | 0.5 | -1 |
Feb | 2 | 9 | 1 | 10 | 10 | 0.5 | 0.5 |
March | 3 | 10.5 | 0 | 10.5 | 10 | 0.5 | 0 |
April | 4 | 10 | 2 | 12 | 10 | 0.5 | 1 |
May | 5 | 11 | 1 | 12 | 10 | 0.5 | 0.5 |
- We can interpret the second iteration as the following:
- Our predicted price of salmon in February is $\hat{y}_2 = \mu + \theta \epsilon_1 = 10 + 0.5(-2) = 9$
- The error of our predicted price of salmon in January is $\epsilon_1 = -2$
- The actual price of salmon in February is $y_2 = 10$
- The average price of salmon is $\mu = 10$
- The percentage of the previous error we wanted to include is $\theta = 0.5$
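
The order 1 table can be reproduced with a short loop. This is a sketch under two assumptions: the error is computed as observed minus predicted, and no error is known before January, so the first prediction is just the mean.

```python
mu, theta = 10, 0.5
observed = [8, 10, 10.5, 12, 12]  # Jan..May prices from the table

predictions, errors = [], []
prev_error = 0.0  # assumption: no error is known before the first month
for y in observed:
    y_hat = mu + theta * prev_error  # prediction for this month
    prev_error = y - y_hat           # error once the price is observed
    predictions.append(y_hat)
    errors.append(prev_error)

print(predictions)  # [10.0, 9.0, 10.5, 10.0, 11.0]
print(errors)       # [-2.0, 1.0, 0.0, 2.0, 1.0]
```

Each prediction depends only on the error from the month before, which is why a single large miss (April) affects exactly one later forecast (May).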
An Example of a Moving Average Model of Order 2
- Now, let's say we want to use the two previous months' data to predict the price of salmon using a moving average model
- We can define our model as the following:

$$
\hat{y}_t = \mu + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}
$$
- The table below shows our observed data and predictions of a few iterations
month | $t$ | $\hat{y}_t$ | $\epsilon_t$ | $y_t$ | $\mu$ | $\theta_1, \theta_2$ | $\theta \epsilon_t$ |
---|---|---|---|---|---|---|---|
Jan | 1 | 10 | -2 | 8 | 10 | 0.5 | -1 |
Feb | 2 | 9 | 1 | 10 | 10 | 0.5 | 0.5 |
March | 3 | 9.5 | 1 | 10.5 | 10 | 0.5 | 0.5 |
April | 4 | 11 | 1 | 12 | 10 | 0.5 | 0.5 |
May | 5 | 11 | 1 | 12 | 10 | 0.5 | 0.5 |
- We can interpret the third iteration as the following:
- Our predicted price of salmon in March is $\hat{y}_3 = \mu + \theta_1 \epsilon_2 + \theta_2 \epsilon_1 = 10 + 0.5(1) + 0.5(-2) = 9.5$
- The error of our predicted price of salmon in February is $\epsilon_2 = 1$
- The error of our predicted price of salmon in January is $\epsilon_1 = -2$
- The actual price of salmon in March is $y_3 = 10.5$
- The average price of salmon is $\mu = 10$
- The percentage of the error from February's prediction that we wanted to include is $\theta_1 = 0.5$
- The percentage of the error from January's prediction that we wanted to include is $\theta_2 = 0.5$
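
The same bookkeeping extends to two lags. The sketch below recomputes the first four months of the order 2 example, assuming both weights equal 0.5 (as the table suggests), errors computed as observed minus predicted, and errors from before January treated as 0.

```python
mu = 10
thetas = [0.5, 0.5]           # theta_1, theta_2 (assumed equal)
observed = [8, 10, 10.5, 12]  # Jan..Apr prices from the table

errors = []       # errors[-1] is eps_{t-1}, errors[-2] is eps_{t-2}
predictions = []
for y in observed:
    # Most recent errors first; months before January contribute 0.
    recent = errors[::-1][:len(thetas)]
    recent += [0.0] * (len(thetas) - len(recent))
    y_hat = mu + sum(th * e for th, e in zip(thetas, recent))
    predictions.append(y_hat)
    errors.append(y - y_hat)

print(predictions)  # [10.0, 9.0, 9.5, 11.0]
```

Because `thetas` is a list, the same loop covers any order: adding a third weight would pull in the error from three months back.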
Determining the Order Parameter
- The moving average model is parameterized by an order $q$, which refers to the number of lags to account for in the prediction
- Similar to an autoregressive model, including every single lag variable (or a very large number of lag variables) is a typical naive approach to fitting a moving average model
- This approach typically leads to overfitting
- Therefore, we are interested in choosing the smallest order $q$ for our model that still includes the significant lags
- This will help us avoid overfitting and build a model that will hold up better over time
- We can determine which lags are most significant by observing the lags within an autocorrelation function (or ACF) chart
- Specifically, we want to know what order $q$ includes only the lags that are most directly or indirectly correlated with the price of salmon in our current month
- Essentially, we only want to include the lags in our model whose direct or indirect effects (based on the ACF) are high in magnitude according to the chart
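
Reading the chart can be approximated in code. For a true moving average process of order $q$, the theoretical autocorrelation is zero beyond lag $q$, so a common rule of thumb is to look for the last lag whose sample autocorrelation is clearly nonzero. The sketch below is one simple way to do that, not a definitive procedure: the helper names `acf` and `suggest_order` are hypothetical, and the cutoff of $\pm 1.96 / \sqrt{n}$ is the usual rough 95% band for white noise.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def suggest_order(series, max_lag):
    """Largest lag whose autocorrelation falls outside the rough
    95% band of +/- 1.96 / sqrt(n); 0 if none does."""
    band = 1.96 / math.sqrt(len(series))
    r = acf(series, max_lag)
    significant = [k for k in range(1, max_lag + 1) if abs(r[k]) > band]
    return max(significant, default=0)
```

With only a handful of observations, as in the salmon tables, the band is very wide and almost no lag will look significant, so a heuristic like this is only meaningful on a longer price history.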