Fitted values and residuals

1. Fitted values and residuals

One way to check if our forecasting method is any good is to try to forecast what we have already seen.

2. Fitted values and residuals

One-step-ahead forecasts of the data already seen are called "fitted values". That is, each fitted value is a forecast of an observation based on all data up to and including the previous observation. If our forecasting method involves estimating any parameters, these parameters are usually estimated using all available data. So fitted values are not true forecasts: we have cheated by using the very observations we are "forecasting" when building the model. Nevertheless, they can be useful.

A residual is the difference between an observation and its fitted value. This is the bit left over that we did not predict. If our forecasting method is good, the residuals should look like white noise.
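In symbols, the two definitions can be written as below. The notation is an assumption introduced here for illustration and is not taken from the video: y_t is the observation at time t, the hatted term is its fitted value, and e_t is the residual.

```latex
\hat{y}_{t \mid t-1} = \text{forecast of } y_t \text{ based on } y_1, \dots, y_{t-1},
\qquad
e_t = y_t - \hat{y}_{t \mid t-1}
```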

3. Example: oil production

The data plotted here is the annual oil production in Saudi Arabia. We could try to forecast this using the naive method. The fitted values from the naive method are also shown on the plot. Remember that the naive method simply uses the most recent observation as the forecast for future observations. Hence the green line is just the same as the red line, but shifted by one year. The residuals are the vertical distances between these two lines.
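To make the "shifted by one year" idea concrete, here is a minimal base R sketch. The numbers are made up for illustration and are not the Saudi Arabian oil data; the variable names are hypothetical.

```r
# Made-up annual series for illustration only (NOT the oil production data).
y <- ts(c(110, 120, 115, 130, 128), start = 2001)

# Naive method: the one-step forecast of each observation is the previous
# observation, so the fitted values are the series shifted forward one year.
# There is no fitted value for the first year, hence the leading NA.
fitted_naive <- ts(c(NA, head(as.numeric(y), -1)),
                   start = start(y), frequency = frequency(y))

# Residual = observation minus fitted value.
res <- y - fitted_naive

print(cbind(observed = y, fitted = fitted_naive, residual = res))
```

If the forecast package is installed, `fitted(naive(y))` and `residuals(naive(y))` should give the same values without the manual shifting.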

4. Example: oil production

Here the residuals are plotted. We are hoping that these look like white noise, because that would mean that the forecast method we used captured all the available information in the data.

5. Residuals should look like white noise

There are four basic assumptions we will make about the residuals: two are essential and two are just convenient.

The first essential assumption is that the residuals should be uncorrelated. Otherwise there is information left in the residuals that should have been captured by the forecasting method. Next, we assume that the residuals have zero mean. If that weren't true, the forecasts would be biased, and we could easily fix the problem by adjusting the forecasts until the residuals have zero mean. So this is a trivial requirement.

The next two assumptions are not actually essential; they just make life easier. We assume that the residuals have constant variance, and we assume that they are normally distributed. These two assumptions are used in computing the prediction intervals.

Earlier I said that residuals should look like white noise. White noise would satisfy assumptions 1, 2 and 3. So we are actually asking for something slightly more from our residuals: they should look like Gaussian white noise.

There is a very convenient function available for you to check these assumptions. Just pass the forecast object to the checkresiduals() function. It will produce a time plot, an ACF plot and a histogram of the residuals, and do a Ljung-Box test on them.
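A call to the residual-checking function might look like the sketch below. It assumes the forecast package is installed and uses a simulated random walk in place of a real series, so the plots and test result will not match those shown in the video.

```r
library(forecast)  # assumed to be installed; provides naive() and checkresiduals()

# Simulated random walk standing in for a real series (illustration only).
set.seed(123)
y <- ts(cumsum(rnorm(50)), start = 1965)

# Naive forecasts; the returned forecast object also carries the fitted
# values and the residuals.
fc <- naive(y, h = 10)

# Time plot, ACF plot and histogram of the residuals, plus a Ljung-Box test
# printed to the console.
checkresiduals(fc)
```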

6. checkresiduals()

In this case, the Ljung-Box test has a p-value well above the 0.05 threshold, so there is no problem with the autocorrelations: they look like what you would expect from white noise. That is confirmed by the ACF plot. The histogram looks pretty close to a normal curve, although there is possibly one outlier on the negative side.

Make a habit of always checking your residuals before proceeding to produce the forecasts. If you do end up with some non-normality, or some autocorrelation in your residuals, don't despair. The point forecasts may still be good and can be used. It is the prediction intervals that might be either too wide or too narrow, and should not be taken too seriously.
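The p-value comparison can also be done by hand. The sketch below uses base R's Box.test() as a stand-in, on simulated data; the lag of 10 is an arbitrary choice here and need not match what checkresiduals() uses, so the two may report different numbers.

```r
# Simulated random walk for illustration only; its naive residuals are the
# first differences of the series.
set.seed(2024)
y <- cumsum(rnorm(100))
res <- diff(y)

# Ljung-Box test on the first 10 autocorrelations (the lag is an arbitrary
# choice for this sketch).
lb <- Box.test(res, lag = 10, type = "Ljung-Box")
print(lb)

# Zero-mean check: a mean far from zero would suggest biased forecasts.
cat("Mean of residuals:", mean(res), "\n")

# Compare the p-value with the 0.05 threshold.
if (lb$p.value > 0.05) {
  cat("No evidence of autocorrelation at the 5% level.\n")
} else {
  cat("Residual autocorrelation detected; treat prediction intervals with caution.\n")
}
```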

7. Let's practice!

Let's try these ideas out on some other time series in the next exercise.

This exercise is part of the course Forecasting in R.

