Comparing auto.arima() and ets() on seasonal data
What happens when you want to create training and test sets for data that is more frequent than yearly? If needed, you can use a vector of the form c(year, period) for the start and/or end arguments of the window() function. You must also ensure that you're using an appropriate value of h in the forecasting functions. Recall that h should equal the number of observations in your test set.
For example, if your monthly data spans 15 years, your training set consists of the first 10 years, and you intend to forecast the last 5 years of data, you would use h = 12 * 5, not h = 5, because your test set would include 60 monthly observations. If instead your training set consists of the first 9.5 years and you want to forecast the last 5.5 years, you would use h = 66 to account for the extra 6 months.
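The arithmetic above can be checked with a small sketch in base R. The series y here is a hypothetical placeholder (its values and its 2001 start year are assumptions made only for illustration); the point is how c(year, period) and the test-set length fit together.

```r
# Hypothetical monthly series covering 15 years (placeholder values)
y <- ts(seq_len(180), start = c(2001, 1), frequency = 12)

# Case A: train on the first 10 years, test on the last 5
train_a <- window(y, end = c(2010, 12))
test_a  <- window(y, start = c(2011, 1))
h_a <- length(test_a)   # number of monthly observations to forecast

# Case B: train on the first 9.5 years, test on the last 5.5
train_b <- window(y, end = c(2010, 6))
test_b  <- window(y, start = c(2010, 7))
h_b <- length(test_b)

print(c(h_a = h_a, h_b = h_b))
```

Deriving h from length() of the test set, rather than typing the number by hand, avoids off-by-a-period mistakes when the split does not fall on a year boundary.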
In the final exercise for this chapter, you will compare seasonal ARIMA and ETS models applied to the quarterly cement production data qcement. Because the series is very long, you can afford to use a training and test set rather than time series cross-validation. This is much faster.
The qcement data is available to use in your workspace.
This exercise is part of the course Forecasting in R.
Exercise instructions
- Create a training set called train consisting of 20 years of qcement data beginning in the year 1988 and ending at the last quarter of 2007; you must use a vector for end. The remaining data is your test set.
- Fit ARIMA and ETS models to the training data and save these to fit1 and fit2, respectively.
- Just as you have done with previous exercises, check that both models have white noise residuals.
- Produce forecasts for the remaining data from both models as fc1 and fc2, respectively. Set h to the total number of quarters in your test set. Be careful: the last observation in qcement is not the final quarter of the year!
- Using the accuracy() function, find the better model based on the RMSE value, and save it as bettermodel.
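One way to avoid miscounting the test-set quarters is to let R count them. The following is a minimal sketch in base R using a stand-in quarterly series y, not qcement itself; its end date (the first quarter of 2014) is an assumption chosen only so that the series stops partway through a year.

```r
# Stand-in quarterly series ending in Q1 (placeholder values, assumed dates)
y <- ts(seq_len(105), start = c(1988, 1), frequency = 4)

# Training set: 1988 Q1 through 2007 Q4, using a vector for `end`
train_demo <- window(y, start = 1988, end = c(2007, 4))

# Everything after the training set is the test set
h_demo <- length(y) - length(train_demo)

print(end(y))     # last (year, quarter) in the series
print(h_demo)     # quarters left over for the test set
```

Because the series ends in the first quarter, the test set holds one more observation than a whole number of years would suggest.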
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use 20 years of the qcement data beginning in 1988
train <- window(___, start = ___, end = ___)
# Fit an ARIMA and an ETS model to the training data
fit1 <- ___
fit2 <- ___
# Check that both models have white noise residuals
___
___
# Produce forecasts for each model
fc1 <- forecast(___, h = ___)
fc2 <- forecast(___, h = ___)
# Use accuracy() to find better model based on RMSE
accuracy(___, ___)
accuracy(___, ___)
bettermodel <- ___
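For reference, the same train/fit/check/forecast/compare workflow can be sketched on a different series, leaving the blanks above for you to complete. This sketch assumes the forecast package is installed and uses UKgas (a quarterly series from base R's datasets package) as a stand-in; the split date and all variable names here are illustrative choices, not part of the exercise.

```r
library(forecast)  # assumed installed: auto.arima(), ets(), checkresiduals(), accuracy()

y <- UKgas  # quarterly UK gas consumption from base R's datasets package

# Hold out everything after 1980 Q4 (an arbitrary split chosen for illustration)
train_gas <- window(y, end = c(1980, 4))
h_gas <- length(y) - length(train_gas)

# Fit one model of each family to the training data
fit_arima <- auto.arima(train_gas)
fit_ets   <- ets(train_gas)

# Ljung-Box tests on the residuals (plots suppressed)
checkresiduals(fit_arima, plot = FALSE)
checkresiduals(fit_ets, plot = FALSE)

# Forecast across the whole test period
fc_arima <- forecast(fit_arima, h = h_gas)
fc_ets   <- forecast(fit_ets, h = h_gas)

# Passing the full series lets accuracy() score the held-out observations
rmse_arima <- accuracy(fc_arima, y)["Test set", "RMSE"]
rmse_ets   <- accuracy(fc_ets, y)["Test set", "RMSE"]

# Keep whichever model has the lower test-set RMSE
better_gas <- if (rmse_arima < rmse_ets) fit_arima else fit_ets
```

Which model wins depends on the series and the split, so the comparison should be rerun rather than assumed to carry over to other data.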