Exercise

Comparing auto.arima() and ets() on seasonal data

What happens when you want to create training and test sets for data that is more frequent than yearly? If needed, you can pass a vector of the form c(year, period) to the start and/or end arguments of the window() function. You must also ensure that you're using the appropriate value of h in forecasting functions. Recall that h should be equal to the number of observations in your test set.
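As a minimal sketch of the c(year, period) form, the snippet below splits a made-up monthly series with window(). The series x and its dates are hypothetical and chosen only for illustration; they are not part of the exercise data.

```r
# Hypothetical monthly series: 15 years, January 2000 to December 2014
x <- ts(seq_len(180), start = c(2000, 1), frequency = 12)

# Training set: first 10 years, ending at period 12 (December) of 2009
train <- window(x, end = c(2009, 12))

# Test set: everything from period 1 (January) of 2010 onward
test <- window(x, start = c(2010, 1))

length(train)  # 120
length(test)   # 60
```

For quarterly data the same pattern applies with frequency = 4, so the period runs from 1 to 4 instead of 1 to 12.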

For example, if your monthly data spans 15 years, your training set consists of the first 10 years, and you intend to forecast the last 5 years of data, you would use h = 12 * 5, not h = 5, because your test set would include 60 monthly observations. If instead your training set consists of the first 9.5 years and you want to forecast the last 5.5 years, you would use h = 66 to account for the extra 6 months.
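The arithmetic above can be checked by letting R count the test observations rather than computing h by hand. This is only a sketch on a hypothetical 15-year monthly series (x is an invented placeholder, not the exercise data).

```r
# Hypothetical monthly series: 15 years, January 2000 to December 2014
x <- ts(seq_len(180), start = c(2000, 1), frequency = 12)

# Training set: first 9.5 years, ending in June 2009
train <- window(x, end = c(2009, 6))

# Test set: the remaining 5.5 years, starting in July 2009
test <- window(x, start = c(2009, 7))

# Derive the forecast horizon from the test set itself
h <- length(test)
h  # 66, i.e. 5.5 * 12
```

Deriving h from length(test) avoids off-by-a-period mistakes when the series does not end on a full year.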

In the final exercise for this chapter, you will compare seasonal ARIMA and ETS models applied to the quarterly cement production data qcement. Because the series is very long, you can afford to use a training and test set rather than time series cross-validation. This is much faster.

The qcement data is available to use in your workspace.

Instructions

  • Create a training set called train consisting of 20 years of qcement data beginning in the year 1988 and ending at the last quarter of 2007; you must use a vector for end. The remaining data is your test set.
  • Fit ARIMA and ETS models to the training data and save these to fit1 and fit2, respectively.
  • Just as you have done in previous exercises, check that both models have white noise residuals.
  • Produce forecasts for the remaining data from both models as fc1 and fc2, respectively. Set h to the total number of quarters in your test set. Be careful: the last observation in qcement is not the final quarter of the year!
  • Using the accuracy() function, find the better model based on the RMSE value, and save it as bettermodel.
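One possible way to work through these steps is sketched below. It assumes the forecast package is installed and that qcement is supplied by the fpp2 package (the exercise only states that the series is already in the workspace). The helper names test, rmse1 and rmse2 are introduced here for illustration and are not required by the exercise.

```r
library(forecast)  # auto.arima(), ets(), forecast(), accuracy(), checkresiduals()
library(fpp2)      # assumption: package that provides the qcement series

# Training set: 1988 Q1 through 2007 Q4; the test set is everything after
train <- window(qcement, start = 1988, end = c(2007, 4))
test <- window(qcement, start = c(2008, 1))

# Fit a seasonal ARIMA model and an ETS model to the training data
fit1 <- auto.arima(train)
fit2 <- ets(train)

# Residual diagnostics: plots plus a Ljung-Box test for each model
checkresiduals(fit1)
checkresiduals(fit2)

# Forecast horizon taken from the test set, since the series ends mid-year
h <- length(test)
fc1 <- forecast(fit1, h = h)
fc2 <- forecast(fit2, h = h)

# Test-set RMSE for each model
rmse1 <- accuracy(fc1, qcement)["Test set", "RMSE"]
rmse2 <- accuracy(fc2, qcement)["Test set", "RMSE"]

# Keep whichever model has the lower test-set RMSE
bettermodel <- if (rmse1 < rmse2) fit1 else fit2
```

Choosing bettermodel programmatically is a design choice of this sketch; you can equally read the RMSE values from the accuracy() output and assign fit1 or fit2 by hand.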