1. Box-Jenkins method
You've learned lots of tools and methods for working with and modeling time series. In this lesson you will learn about the best practices framework for using these tools.
2. The Box-Jenkins method
Building time series models can be a lot of work for the modeler, so we want to carry out these projects quickly, efficiently and rigorously. This is where the Box-Jenkins method comes in.
The Box-Jenkins method is a kind of checklist for you to go from raw data to a model ready for production.
The three main steps that stand between you and a production-ready model are identification, estimation and model diagnostics.
3. Identification
In the identification step we explore and characterize the data to find a form of it that is suitable for ARIMA modeling.
We need to know whether the time series is stationary and find which transformations, such as differencing or taking the log of the data, will make it stationary.
Once we have found a stationary form, we must identify which orders p and q are the most promising.
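To make the transformations mentioned above concrete, here is a minimal sketch using NumPy. The series is a made-up exponential growth curve, an assumption chosen only for illustration. It shows that differencing alone leaves a trend in this kind of data, while taking the log first and then differencing gives a flat series.

```python
import numpy as np

# Hypothetical series growing 5% per step (illustrative data, not from the lesson).
t = np.arange(50)
series = 100 * np.exp(0.05 * t)

# Differencing alone: the step sizes keep growing, so a trend remains.
first_diff = np.diff(series)

# Log first, then difference: every step becomes the same constant growth rate.
log_diff = np.diff(np.log(series))

print("first differences:", first_diff[:3])
print("log differences:  ", log_diff[:3])
```

Real data is noisier than this, so in practice you would confirm the result with a plot and a statistical test rather than by eye.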
4. Identification tools
Our tools to test for stationarity include plotting the time series and using the augmented Dickey-Fuller test.
Then we can take the difference or apply transformations until we find the simplest set of transformations that make the time series stationary.
Finally we use the ACF and PACF to identify promising model orders.
5. Estimation
The next step is estimation, which involves using numerical methods to estimate the AR and MA coefficients of the model. Thankfully, this is automatically done for us when we call the model's .fit() method.
At this stage we might fit many models and use the AIC and BIC to narrow down to more promising candidates.
6. Model diagnostics
In the model diagnostics step, we evaluate the quality of the best-fitting model. This is where we use our test statistics and diagnostic plots to make sure the residuals are well behaved.
7. Decision
Using the information gathered from statistical tests and plots during the diagnostic step, we need to make a decision. Is the model good enough, or do we need to go back and rework it?
8. Repeat
If the residuals aren't as they should be, that is, uncorrelated and roughly normally distributed around zero, we will go back and rethink our choices in the earlier steps.
9. Production
If the residuals are okay then we can go ahead and make forecasts!
10. Box-Jenkins
This should be your general project workflow when developing time series models. You may have to repeat the process a few times in order to build a model that fits well. But as they say, no pain, no gain.
11. Let's practice!
In the following exercise you will go through these steps to take an unknown time series and make a model ready for forecasting. Let's go!