Fitted values and residuals

1. Fitted values and residuals

One way to check if our forecasting method is any good is to try to forecast what we have already seen.

2. Fitted values and residuals

One-step-ahead forecasts of the data already seen are called "fitted values". That is, each fitted value is a forecast based on all data up to and including the previous observation. If our forecasting method involves estimating any parameters, these parameters are usually estimated using all available data. So the fitted values are not true forecasts, as we have cheated by already using the data when building our forecasting model. Nevertheless, they can be useful. A residual is the difference between an observation and its fitted value. This is the bit left over that we did not predict. If our forecasting method is good, the residuals should look like white noise.
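
The definition above can be written compactly. The symbols here are assumptions introduced for illustration and do not appear on the slides: y_t is the observation at time t, and \hat{y}_{t|t-1} is its fitted value, that is, the one-step-ahead forecast made using data up to time t-1.

```latex
e_t = y_t - \hat{y}_{t \mid t-1}
```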

3. Example: oil production

The data plotted here is the annual oil production in Saudi Arabia. We could try to forecast this using the naive method. The fitted values from the naive method are also shown on the plot. Remember that the naive method simply uses the most recent observation as the forecast for future observations. Hence the green line is just the same as the red line, but shifted by one year. The residuals are the vertical distances between these two lines.
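
As a rough sketch of what the plot shows, the naive fitted values and their residuals can be computed by hand. This uses plain Python rather than the course's own tools, the helper names `naive_fitted` and `residuals` are hypothetical, and the numbers are made up for illustration (they are not the Saudi Arabian oil data).

```python
from typing import List, Optional


def naive_fitted(y: List[float]) -> List[Optional[float]]:
    """One-step-ahead naive forecasts: the fitted value at time t is y[t-1]."""
    # The first observation has no predecessor, so it has no fitted value.
    return [None] + list(y[:-1])


def residuals(y: List[float], fitted: List[Optional[float]]) -> List[Optional[float]]:
    """Residual = observation minus fitted value (None where no fitted value exists)."""
    return [None if f is None else obs - f for obs, f in zip(y, fitted)]


# Made-up illustrative series, NOT the real oil production data.
y = [100.0, 104.0, 103.0, 110.0, 108.0]
fitted = naive_fitted(y)
res = residuals(y, fitted)

print(fitted)  # [None, 100.0, 104.0, 103.0, 110.0]
print(res)     # [None, 4.0, -1.0, 7.0, -2.0]
```

The fitted series is the original series shifted by one time step, which is why the two lines on the plot have the same shape.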

4. Example: oil production

Here the residuals are plotted. We are hoping that these look like white noise, because that would mean that the forecast method we used captured all the available information in the data.

5. Residuals should look like white noise

There are four basic assumptions we will make about the residuals: two are essential and two are just convenient. The first essential assumption is that the residuals should be uncorrelated. Otherwise there is information in the residuals that should have been captured by the forecasting method. Next, we assume that the residuals have zero mean. If that weren't true, the forecasts would be biased, and we could easily fix the problem by adjusting the forecasts until the residuals have zero mean. So this is a trivial requirement. The next two assumptions are not actually essential; they just make life easier. We assume that the residuals have constant variance, and we assume that they are normally distributed. These two assumptions are used in computing the prediction intervals. Earlier I said that residuals should look like white noise. White noise would satisfy assumptions 1, 2 and 3. So we are actually asking for something slightly more from our residuals: they should look like Gaussian white noise. There is a very convenient function available for you to check these assumptions. Just pass the forecast object to the checkresiduals() function. It will produce a time plot, an ACF plot, a histogram, and do a Ljung-Box test on the residuals.
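
To make the two essential assumptions concrete, here is a minimal sketch in plain Python. It is not how checkresiduals() works internally. The residual values are hypothetical, and the helpers `mean` and `autocorrelation` are names chosen for this illustration. It checks the residual mean, shows the bias adjustment mentioned above, and compares the first few sample autocorrelations with the usual approximate white-noise bounds of ±1.96/√T.

```python
from math import sqrt
from typing import List


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def autocorrelation(xs: List[float], lag: int) -> float:
    """Sample autocorrelation of xs at the given lag."""
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    num = sum((xs[t] - m) * (xs[t - lag] - m) for t in range(lag, len(xs)))
    return num / denom


# Hypothetical residuals, for illustration only.
res = [4.0, -1.0, 7.0, -2.0, 3.0, -5.0, 1.0, 2.0, -3.0, 0.0]

# Zero-mean check: a non-zero mean indicates biased forecasts.
bias = mean(res)
# Adding the bias to every forecast subtracts it from every residual.
adjusted = [e - bias for e in res]
print("mean before:", bias, "mean after:", mean(adjusted))

# Uncorrelatedness check: autocorrelations outside the bounds are suspicious.
bound = 1.96 / sqrt(len(res))
for k in (1, 2, 3):
    r = autocorrelation(res, k)
    print("lag", k, "r =", round(r, 3), "outside bounds:", abs(r) > bound)
```

With so few points the bounds are wide, so this is only meant to show the mechanics.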

6. checkresiduals()

In this case, the Ljung-Box test has a p-value well above the 0.05 threshold, so there is no problem with the autocorrelations - they look like what you would expect from white noise. That is confirmed by the ACF plot. The histogram looks pretty close to a normal curve, although there is possibly one outlier on the negative side. Make a habit of always checking your residuals before proceeding to produce the forecasts. If you do end up with some non-normality, or some autocorrelation in your residuals, don't despair. The point forecasts may still be good and can be used. It is the prediction intervals that might be either too wide or too narrow, and should not be taken too seriously.
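
For readers curious about what a Ljung-Box p-value measures, the following is a simplified sketch in plain Python, not the implementation used by checkresiduals(). It assumes the textbook statistic Q = T(T+2) Σ r_k²/(T−k) over lags 1 to h, compared with a chi-square distribution with h degrees of freedom (no adjustment for estimated parameters). To stay within the standard library, the hypothetical helper `chi2_sf_even_df` only handles an even number of lags, where the chi-square tail probability has a closed form. The residuals are made up.

```python
from math import exp
from typing import List, Tuple


def autocorrelation(xs: List[float], lag: int) -> float:
    """Sample autocorrelation of xs at the given lag."""
    m = sum(xs) / len(xs)
    denom = sum((x - m) ** 2 for x in xs)
    num = sum((xs[t] - m) * (xs[t - lag] - m) for t in range(lag, len(xs)))
    return num / denom


def chi2_sf_even_df(x: float, df: int) -> float:
    """P(X > x) for a chi-square variable with an EVEN number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return exp(-half) * total


def ljung_box(xs: List[float], lags: int) -> Tuple[float, float]:
    """Return (Q statistic, p-value) for the first `lags` autocorrelations."""
    n = len(xs)
    q = n * (n + 2) * sum(
        autocorrelation(xs, k) ** 2 / (n - k) for k in range(1, lags + 1)
    )
    return q, chi2_sf_even_df(q, lags)


# Hypothetical residuals, for illustration only.
res = [4.0, -1.0, 7.0, -2.0, 3.0, -5.0, 1.0, 2.0, -3.0, 0.0,
       2.0, -4.0, 5.0, 1.0, -6.0, 3.0, -1.0, 0.0, 2.0, -2.0]
q, p = ljung_box(res, lags=4)
print("Q =", round(q, 3), "p-value =", round(p, 3))
print("evidence of autocorrelation at the 0.05 level:", p < 0.05)
```

A large p-value means the autocorrelations are jointly no bigger than white noise would produce; a small one suggests the method has left predictable structure in the residuals.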

7. Let's practice!

Let's try these ideas out on some other time series in the next exercise.