
Exponentially weighted forecasts

1. Exponentially weighted forecasts

Two very simple forecasting methods are the naive method and the mean method. The naive method uses only the most recent observation as the forecast for all future periods, while the mean method uses the average of all observations as the forecast for all future periods. Something between these two extremes would be useful - a forecast based on all observations, but where the most recent observations are more heavily weighted.
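In symbols, if y_1, ..., y_T are the observations, these two extremes can be written as follows (the y-hat notation is introduced on the next slide):

$$
\text{Naive: } \hat{y}_{T+h|T} = y_T, \qquad \text{Mean: } \hat{y}_{T+h|T} = \frac{1}{T}\sum_{t=1}^{T} y_t .
$$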

2. Simple exponential smoothing

This is the idea of exponentially weighted forecasts, commonly known as "Simple Exponential Smoothing". Before discussing the forecast equation, I need to introduce some new notation. We shall use y-hat to mean a point forecast, where the subscript tells us what period we are forecasting, and how far ahead we are forecasting. The vertical line in the subscript means "conditional on" or "given". So y-hat with subscript t+h|t means we are forecasting h steps ahead given data up to time t. Now this equation describes an exponentially weighted forecast. That is, a weighted average of all the data up to time t where the weights decrease exponentially as we go back in time.
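Written out, the forecast equation is

$$
\hat{y}_{t+h|t} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots, \qquad 0 \le \alpha \le 1,
$$

where the weight $\alpha(1-\alpha)^j$ on the observation $j$ periods back shrinks geometrically as we move back in time.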

3. Simple exponential smoothing

Here, the alpha parameter determines how much weight is placed on the most recent observation, and how quickly the weights decay away. Larger alpha means more weight is placed on the most recent observation, and the weights decay away very quickly. A small value of alpha means a smaller weight is placed on the most recent observations, and the weights decay away more slowly.
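As a quick illustration (not part of the original example), the weights $\alpha(1-\alpha)^j$ can be computed directly in R for two illustrative values of alpha:

```r
# Weight placed on the observation j periods back is alpha * (1 - alpha)^j
j <- 0:5
round(0.8 * (1 - 0.8)^j, 3)  # large alpha: 0.800 0.160 0.032 0.006 0.001 0.000
round(0.2 * (1 - 0.2)^j, 3)  # small alpha: 0.200 0.160 0.128 0.102 0.082 0.066
```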

4. Simple exponential smoothing

An equivalent and more convenient way of writing the forecast equation uses this component form. It might not be obvious to you that this gives the same result, but it does. We think of l_t as the unobserved level component. The forecasts are equal to the most recent estimate of the level. The level itself evolves over time based on the most recent observation and the previous estimate of the level. When written like this, it is clear that we need to estimate two parameters: alpha and the initial level component, l_0. In regression, we estimate parameters by minimizing the sum of squared errors. We can do exactly the same here. However, unlike regression, there is no nice formula that gives the optimal parameters. Instead, we have to use a non-linear optimization routine.
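In component form, the method is

$$
\begin{aligned}
\text{Forecast equation:} \quad & \hat{y}_{t+h|t} = \ell_t \\
\text{Smoothing equation:} \quad & \ell_t = \alpha y_t + (1-\alpha)\,\ell_{t-1},
\end{aligned}
$$

and the parameters $\alpha$ and $\ell_0$ are chosen to minimize the sum of squared one-step errors, $\sum_{t=1}^{T} \big(y_t - \hat{y}_{t|t-1}\big)^2$, using a numerical optimization routine.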

5. Example: oil production

Fortunately, R handles all this for you with the ses function. In this example, we consider data on annual oil production in Saudi Arabia from 1996. The ses function has estimated an alpha value of 0.83, which is quite high. It means that 83% of the weight is placed on the most recent observation, about 14% on the observation before that, and the remaining 3% on earlier observations. The initial level l_0 is estimated to be about 447. We have set h = 5, so forecasts for the next five years are computed and stored in the object returned by ses.
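A minimal sketch of this example is shown below. It assumes the ses function from the forecast package and the oil series from the fpp2 package (annual Saudi Arabian oil production); the exact data object used in the video may differ.

```r
library(fpp2)  # loads the forecast package and the oil series

oildata <- window(oil, start = 1996)  # annual oil production from 1996 onwards
fc <- ses(oildata, h = 5)             # simple exponential smoothing, 5 years ahead

summary(fc)  # reports the estimated alpha and initial level l_0, plus the forecasts
```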

6. Example: oil production

Here is a plot of the forecasts. Remember, simple exponential smoothing gives the same value for all forecast horizons - it is the estimated mean of the possible future sample paths. Because alpha is quite high, the forecast values are close to the last observation. As the name suggests, simple exponential smoothing is a very simple method. But it forms the starting point for more complicated methods in the exponential smoothing family - methods that will handle trends and seasonality. You will consider these later in this chapter.
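One way to produce such a plot, continuing the sketch above (fc is the object returned by ses):

```r
# Plot the point forecasts and prediction intervals, with fitted values overlaid
autoplot(fc) +
  autolayer(fitted(fc), series = "Fitted")
```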

7. Let's practice!

Let's try using the ses function in the next exercise.