
State space models for exponential smoothing

1. State space models for exponential smoothing

All of the exponential smoothing methods can be written

2. Innovations state space models

in the form of innovations state space models. Remember we had 3 possible trends (none, additive or damped)

3. Innovations state space models

and 3 possible seasonal components (none, additive or multiplicative),

4. Innovations state space models

giving 9 possible exponential smoothing methods. Each of these can be written as a state space model in two different ways,

5. Innovations state space models

one with additive errors and one with multiplicative errors.

6. Innovations state space models

So there are 18 possible state space models. Multiplicative errors mean that the noise increases with the level of the series, just as multiplicative seasonality means that the seasonal fluctuations increase with the level of the series. These are known as ETS models, which stands for Error, Trend, Seasonal models. The name is also deliberately reminiscent of Exponential Smoothing.
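The count of 18 can be checked by enumerating the combinations. The following is a minimal Python sketch, not part of the R workflow used in this course; the short labels (such as "Ad" for a damped trend) are assumed here to follow the usual ETS(Error,Trend,Seasonal) notation.

```python
from itertools import product

# Assumed short labels for each component of an ETS model.
ERRORS = {"A": "additive", "M": "multiplicative"}
TRENDS = {"N": "none", "A": "additive", "Ad": "additive damped"}
SEASONALS = {"N": "none", "A": "additive", "M": "multiplicative"}


def ets_models():
    """Return a label for every (error, trend, seasonal) combination."""
    return [
        f"ETS({error},{trend},{seasonal})"
        for error, trend, seasonal in product(ERRORS, TRENDS, SEASONALS)
    ]


models = ets_models()
print(len(models))  # 2 error types x 3 trends x 3 seasonal components
print(models)
```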

7. ETS models

The advantage of thinking in this way is that you can then use maximum likelihood estimation to optimize the parameters, and you have a way of generating prediction intervals for all models. Most importantly, you have a way of selecting the best model for a particular time series. So rather than looking at graphs and guessing what might work in each case, you can automatically select an exponential smoothing state space model to use for each time series. You can do this by minimizing Akaike's Information Criterion, named after Japanese statistician Hirotugu Akaike. You'll use a bias-corrected version known as the AICc. Minimizing the AICc gives roughly the same result as using time series cross-validation, especially on long time series, but it's much faster.
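To make the criterion concrete, here is a small Python sketch of the standard AIC and AICc formulas. It assumes that `k` counts every estimated quantity (smoothing parameters, initial states and the error variance) and that `n` is the number of observations; the function names are illustrative and are not taken from any forecasting library.

```python
def aic(log_likelihood, k):
    """Akaike's Information Criterion: -2 log L + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood, k, n):
    """Bias-corrected AIC for a series of length n with k estimated parameters."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    correction = (2.0 * k * (k + 1)) / (n - k - 1)
    return aic(log_likelihood, k) + correction


# The correction term matters on short series and fades on long ones.
short_series = aicc(-100.0, k=5, n=50)
long_series = aicc(-100.0, k=5, n=5000)
print(short_series, long_series)
```

Among candidate models fitted to the same series, the one with the smallest AICc would be chosen.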

8. Example: Australian air traffic

The ets function does all the work for us. Just give it a time series, and it comes back with the best model found by minimizing the AICc. In this case, our time series is ausair, the annual number of passengers on Australian airlines. The best model is an ETS(M,A,N) model. That is, it has multiplicative errors, additive trend and no seasonality. The parameters are estimated in much the same way as when you used the holt function, except that ets maximizes the likelihood rather than minimizing the sum of squared errors. Apart from the way the parameters are chosen, this model is equivalent to using Holt's linear method. What is different is that ets does not compute the forecasts for you. It returns a model. To produce forecasts, you need to pass that model to the forecast function.
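The equivalence with Holt's linear method can be seen from the model recursions. Below is a minimal Python sketch of the ETS(M,A,N) updating equations under assumed, already-known parameters; it is not the ets function, which also estimates `alpha`, `beta` and the initial states by maximum likelihood. The function name and arguments are hypothetical.

```python
def ets_man_point_forecasts(y, alpha, beta, level0, slope0, h):
    """Run the ETS(M,A,N) recursions over y and return h point forecasts.

    alpha and beta are smoothing parameters in the state space
    parameterization; level0 and slope0 are the initial states.
    """
    level, slope = level0, slope0
    for obs in y:
        mu = level + slope           # one-step-ahead forecast
        e = (obs - mu) / mu          # relative (multiplicative) error
        level = mu * (1 + alpha * e)     # same as mu + alpha * (obs - mu)
        slope = slope + beta * mu * e    # same as slope + beta * (obs - mu)
    # Point forecasts extend the last level along the last slope.
    return [level + step * slope for step in range(1, h + 1)]


# A perfectly linear series gives zero errors, so the forecasts continue the line.
print(ets_man_point_forecasts([12, 14, 16], alpha=0.5, beta=0.1,
                              level0=10, slope0=2, h=3))
```

Because `mu * e` is just the ordinary one-step error, the level and slope updates match those of Holt's linear method; the multiplicative error changes the likelihood and the prediction intervals, not the point forecasts.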

9. Example: Australian air traffic

Here we take the data, pass it to the ets function, pass that result to the forecast function, and finally plot the result. The linear trend is clearly seen. The multiplicative errors mean that the width of the prediction intervals grows more quickly than if an additive error model had been chosen.

10. Example: Monthly corticosteroid drug sales

Let's look at a seasonal example. The h02 data contains monthly sales of corticosteroid drugs in Australia. In this case, the ets function has selected a model with multiplicative errors, damped trend, and multiplicative seasonality. There are a lot of parameters to estimate: four smoothing parameters (counting the damping parameter), an initial level, an initial slope and 11 initial seasonal values. The 12th seasonal value is calculated to ensure that the multiplicative seasonal indices average to one, that is, sum to 12.
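The parameter tally for this model can be reproduced with a short Python sketch. It assumes the counting convention used in the paragraph above, where the damping parameter is grouped with the smoothing parameters and one seasonal state is fixed by the normalization; the function name is hypothetical.

```python
def count_ets_parameters(trend, seasonal, period):
    """Return (smoothing parameters, free initial states) for an ETS model.

    trend is "N", "A" or "Ad"; seasonal is "N", "A" or "M";
    period is the number of seasons per cycle (12 for monthly data).
    """
    smoothing = 1   # alpha, for the level
    initial = 1     # initial level
    if trend in ("A", "Ad"):
        smoothing += 1   # beta, for the slope
        initial += 1     # initial slope
    if trend == "Ad":
        smoothing += 1   # phi, the damping parameter
    if seasonal in ("A", "M"):
        smoothing += 1           # gamma, for the seasonal component
        initial += period - 1    # last seasonal state is set by normalization
    return smoothing, initial


smoothing, initial = count_ets_parameters("Ad", "M", period=12)
print(smoothing, initial, smoothing + initial)
```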

11. Example: Monthly corticosteroid drug sales

Forecasts from the resulting model look pretty good. The seasonal pattern and low trend have been well captured. The advantage of using the ets function is that the type of model is chosen for you. So this is a completely automatic method of forecasting. It is a very convenient way to forecast time series that have trend or seasonality.

12. Let's practice!

Now it's your turn to try it out.