1. Automation and saving
Now that you know your way around the SARIMA model you're ready to learn about a powerful tool to search over model orders.
By now you are also able to train good models, so it's time to cover how to save those models and update them later.
2. Searching over model orders
Previously we searched over ARIMA model orders using for-loops. Now that we have seasonal orders as well, that approach becomes unwieldy. Fortunately there is a package that will do most of this work for us.
This is the pmdarima package.
The auto-underscore-arima function from this package loops over model orders to find the best one.
3. pmdarima results
The object returned by the function is the results object of the best model found by the search.
This object is almost exactly like a statsmodels SARIMAX results object and has the summary and the plot-diagnostics methods.
4. Non-seasonal search parameters
The auto-arima function has a lot of parameters that we may want to set. Many of these have default values, but let's cover the important ones.
5. Non-seasonal search parameters
The only required argument to the function is data. Optionally we can also set the order of non-seasonal differencing; initial estimates of the non-seasonal orders; and the maximum values of non-seasonal orders to test.
6. Seasonal search parameters
If the time series is seasonal then we set the seasonal parameter to true.
We also need to specify the length of the seasonal period and the order of seasonal differencing.
As with the non-seasonal parameters we can specify initial guesses and maximum values for the seasonal orders.
7. Other parameters
Finally, there are a few non-order parameters that we want to set.
We select whether to choose the best model based on AIC or BIC.
If trace is set to true then this function prints the AIC and BIC for each model it fits.
To ignore bad models, as in the try-except block that you wrote, you can set the error action to ignore.
If the last parameter, stepwise, is set to true then instead of searching over all model orders the function searches outwards from the initial model order guess using an intelligent search method.
8. Saving model objects
Once you have fit a model in this way, you may want to save it and load it later. You can do this using the joblib package.
To save the model we use the dump function from the joblib package. We pass the model results object and the filepath into this function.
9. Saving model objects
Later on, when we want to make new predictions we can load this model again.
To do this we use the load function from joblib.
10. Updating model
Some time may have passed since we trained the saved model, and we may want to incorporate data that we have collected since then.
We can do this using the pmdarima model's dot-update method. This adds the new observations in df-underscore-new and updates the model parameters. It does not choose the model order again, so if you are updating with a large amount of new data it may be best to go back to the start of the Box-Jenkins method.
11. Update comparison
Updating time series models with new data is really important, since their forecasts are built from the most recent data available.
Here are two dynamic forecasts of US candy production. The top forecast was made with data up to mid-2007; the bottom forecast was made after updating with data up to mid-2009. You can see that the updated model performs much better at future predictions.
12. Let's practice!
Alright, time to practice.