Moving Average Model

Describing a Moving Average Model

  • A moving average model is a linear regression model in which the response variable is the value at the current time period and the predictor variables are the mean of the series and the white noise error terms from the current and previous periods
  • The moving average model should not be confused with the simple moving average, which is the arithmetic mean of the values (for example, prices) observed over a fixed number of past periods
  • Specifically, we can define a moving average model of order $r$ as the following:

    $$s_{t} = \mu + \epsilon_{t} + \theta_{1}\epsilon_{t-1} + \dots + \theta_{r}\epsilon_{t-r}$$
    • Where $s_{t}$ is the unknown true value in the current time period
    • Where $\mu$ is the mean of our values (this is constant for any $t$)
    • Where $\epsilon_{t}$ is the unknown error between the current predicted value and what we will observe
    • Where $\epsilon_{t-1}$ is the error between the previous predicted value and what we observed
    • Where $\theta_{1}$ is the weight given to the error $\epsilon_{t-1}$ (informally, the percentage of that error we should include in our model)
    • Where $\epsilon_{t-r}$ is the error between the $r^{th}$ previous predicted value and what we observed
    • Where $\theta_{r}$ is the weight given to the error $\epsilon_{t-r}$ (informally, the percentage of that error we should include in our model)
  • We can estimate $s_{t}$ by using the following equation, which drops the current error $\epsilon_{t}$ because it is unknown at prediction time and is zero on average:

    $$\hat{s}_{t} = \mu + \theta_{1}\epsilon_{t-1} + \dots + \theta_{r}\epsilon_{t-r}$$
    • Where $\hat{s}_{t}$ is the predicted value in the current time period
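
As a minimal sketch of the forecast equation above, the following Python function computes the prediction from the mean, the weights, and the most recent errors. The name `ma_forecast` and the argument layout (most recent error first) are illustrative assumptions, not something taken from these notes.

```python
def ma_forecast(mu, thetas, past_errors):
    """One-step forecast for a moving average model of order r = len(thetas).

    thetas[0] is theta_1, thetas[1] is theta_2, and so on.
    past_errors[0] is epsilon_{t-1}, past_errors[1] is epsilon_{t-2}, and so on.
    (This ordering is an assumption made for the sketch.)
    """
    if len(past_errors) < len(thetas):
        raise ValueError("need at least as many past errors as weights")
    # zip pairs theta_k with epsilon_{t-k}; any extra past errors are ignored
    return mu + sum(theta * err for theta, err in zip(thetas, past_errors))


# Hypothetical numbers: mean 10, theta_1 = 0.5, previous error -2
print(ma_forecast(10, [0.5], [-2]))  # 9.0
```

The current error does not appear as an argument because it is not known when the forecast is made.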

An Example of a Moving Average Model of Order 1

  • Let's say we're predicting the price of salmon each month using a moving average model built on only the previous month's data
  • We can define our model as the following:

    $$\hat{s}_{t} = \mu + \theta_{1}\epsilon_{t-1} \quad \text{where } \epsilon_{t-1} = s_{t-1} - \hat{s}_{t-1}$$
    • Where $\hat{s}_{t}$ is our current prediction
    • Where $\hat{s}_{t-1}$ is our previous prediction
    • Where $\mu$ is the mean of our values
    • Where $\epsilon_{t-1}$ is the difference between the previous observed value and our previous prediction
    • Where $\theta_{1}$ represents the percentage of the previous error $\epsilon_{t-1}$ we should include in our model
  • The table below shows our observed data and predictions of a few iterations
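
| month | $t$ | $\hat{s}_{t}$ | $\epsilon_{t}$ | $s_{t}$ | $\mu$ | $\theta_{1}$ | $\theta_{1}\epsilon_{t}$ |
|-------|-----|------|----|------|----|-----|-----|
| Jan   | 1   | 10   | -2 | 8    | 10 | 0.5 | -1  |
| Feb   | 2   | 9    | 1  | 10   | 10 | 0.5 | 0.5 |
| March | 3   | 10.5 | 0  | 10.5 | 10 | 0.5 | 0   |
| April | 4   | 10   | 2  | 12   | 10 | 0.5 | 1   |
| May   | 5   | 11   | 1  | 12   | 10 | 0.5 | 0.5 |
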
  • We can interpret the second iteration as the following:

    • Our predicted price of salmon in February $\hat{s}_{2}$ is $9$, since $\mu + \theta_{1}\epsilon_{1} = 10 + 0.5 \times (-2) = 9$
    • The error of our predicted price of salmon in January $\epsilon_{1}$ is $-2$
    • The actual price of salmon in February $s_{2}$ is $10$
    • The average price of salmon $\mu$ is $10$
    • The percentage of the previous error we wanted to include $\theta_{1}$ is $0.5$
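
The order 1 recursion can be sketched in Python to reproduce the predictions and errors in the table. The function name `run_ma1` is hypothetical, and the starting error of 0 (so that the January prediction equals the mean) is an assumption inferred from the table rather than something the notes state. The error is computed as observed minus predicted, which is the convention the table values follow.

```python
def run_ma1(observed, mu, theta1, initial_error=0.0):
    """Walk through a series period by period, returning (prediction, error) pairs.

    initial_error stands in for the error before the first period; setting it
    to 0 is an assumption so that the first prediction equals the mean.
    """
    rows = []
    prev_error = initial_error
    for actual in observed:
        prediction = mu + theta1 * prev_error  # s_hat_t = mu + theta_1 * eps_{t-1}
        error = actual - prediction            # eps_t = s_t - s_hat_t
        rows.append((prediction, error))
        prev_error = error
    return rows


prices = [8, 10, 10.5, 12, 12]  # observed prices, January through May
months = ["Jan", "Feb", "March", "April", "May"]
for month, (pred, err) in zip(months, run_ma1(prices, mu=10, theta1=0.5)):
    print(month, pred, err)
```

Each prediction only needs the single most recent error, which is why the loop carries one value forward from month to month.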

An Example of a Moving Average Model of Order 2

  • Now, let's say we want to use the previous two months' data to predict the price of salmon using a moving average model
  • We can define our model as the following:

    $$\hat{s}_{t} = \mu + \theta_{1}\epsilon_{t-1} + \theta_{2}\epsilon_{t-2}$$
  • The table below shows our observed data and predictions of a few iterations
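
| month | $t$ | $\hat{s}_{t}$ | $\epsilon_{t}$ | $s_{t}$ | $\mu$ | $\theta_{1}$ | $\theta_{2}$ | $\theta_{1}\epsilon_{t}$ |
|-------|-----|-----|----|------|----|-----|-----|-----|
| Jan   | 1   | 10  | -2 | 8    | 10 | 0.5 | 0.5 | -1  |
| Feb   | 2   | 9   | 1  | 10   | 10 | 0.5 | 0.5 | 0.5 |
| March | 3   | 9.5 | 1  | 10.5 | 10 | 0.5 | 0.5 | 0.5 |
| April | 4   | 11  | 1  | 12   | 10 | 0.5 | 0.5 | 0.5 |
| May   | 5   | 11  | 1  | 12   | 10 | 0.5 | 0.5 | 0.5 |
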
  • We can interpret the third iteration as the following:

    • Our predicted price of salmon in March $\hat{s}_{3}$ is $9.5$, since $\mu + \theta_{1}\epsilon_{2} + \theta_{2}\epsilon_{1} = 10 + 0.5 \times 1 + 0.5 \times (-2) = 9.5$
    • The error of our predicted price of salmon in February $\epsilon_{2}$ is $1$
    • The error of our predicted price of salmon in January $\epsilon_{1}$ is $-2$
    • The actual price of salmon in March $s_{3}$ is $10.5$
    • The average price of salmon $\mu$ is $10$
    • The percentage of the error from February's prediction that we wanted to include $\theta_{1}$ is $0.5$
    • The percentage of the error from January's prediction that we wanted to include $\theta_{2}$ is $0.5$
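
The steps leading up to the third iteration can be sketched in Python as follows. The function name `run_ma2` is hypothetical, and treating the errors from before January as 0 is an assumption made so the recursion has a starting point.

```python
def run_ma2(observed, mu, theta1, theta2):
    """Return (prediction, error) pairs for an order 2 moving average model.

    Errors from before the first observation are taken to be 0, which is an
    assumption made so the recursion has somewhere to start.
    """
    rows = []
    err_1 = 0.0  # eps_{t-1}
    err_2 = 0.0  # eps_{t-2}
    for actual in observed:
        prediction = mu + theta1 * err_1 + theta2 * err_2
        error = actual - prediction  # eps_t = s_t - s_hat_t
        rows.append((prediction, error))
        err_1, err_2 = error, err_1  # shift the errors back by one period
    return rows


# Observed prices for January, February, and March
for pred, err in run_ma2([8, 10, 10.5], mu=10, theta1=0.5, theta2=0.5):
    print(pred, err)
```

Compared with the order 1 case, the only change is that two errors are carried forward instead of one.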

Determining the Order Parameter

  • The moving average model is parameterized by an order $q$ (the same quantity written as $r$ in the definition above), which refers to the number of lags to account for in the prediction
  • Similar to an autoregressive model, a typical naive approach to fitting a moving average model is to include every single lag (or a very large number of lags)
  • This approach typically leads to overfitting
  • Therefore, we are interested in choosing the smallest order $q$ for our model that will include only the significant lags
  • This will help us avoid overfitting and build a model that will hold up better over time
  • We can determine which lags are most significant by examining an autocorrelation function (ACF) chart
  • Specifically, we want the order that includes only the lags that are most strongly correlated, directly or indirectly, with the price of salmon in the current month
  • Essentially, we only want to include the lags whose combined direct and indirect effects are large in magnitude according to the ACF chart
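
To make the idea of reading an ACF chart concrete, here is a rough sketch that computes the sample autocorrelations directly and flags the lags that fall outside a band of $\pm 1.96/\sqrt{n}$. That band is a common rule of thumb rather than something these notes prescribe, and the function names, the simulated series, and its weights (0.6 and 0.4) are all hypothetical. For a moving average process of order $q$, the theoretical autocorrelation is zero beyond lag $q$, so the lags that stand out are candidates for inclusion.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelation of the series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
            for k in range(max_lag + 1)]


def significant_lags(series, max_lag):
    """Lags whose sample autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


# Simulate a hypothetical order 2 series:
# s_t = 10 + e_t + 0.6 * e_{t-1} + 0.4 * e_{t-2}
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(2002)]
series = [10 + noise[t] + 0.6 * noise[t - 1] + 0.4 * noise[t - 2]
          for t in range(2, len(noise))]

print(significant_lags(series, max_lag=10))
```

Lags 1 and 2 should stand out clearly for this simulated series. A higher lag can occasionally cross the band by chance, which is one reason the chart is usually read with some judgment rather than applied mechanically.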
