Making time series stationary
1. Making time series stationary
Last time we learned about ways in which a time series can be non-stationary, and how we can identify this by plotting.
2. Overview
However, there are more formal ways of accomplishing this task, using statistical tests. There are also ways to transform non-stationary time series into stationary ones. We'll address both of these in this lesson and then you'll be ready to start modeling.
3. The augmented Dickey-Fuller test
The most common test for identifying whether a time series is non-stationary is the augmented Dickey-Fuller test. This is a statistical test where the null hypothesis is that your time series is non-stationary due to trend.
4. Applying the adfuller test
We can implement the augmented Dickey-Fuller test using statsmodels. First we import the adfuller function as shown, then we can run it on our time series.
5. Interpreting the test result
The results object is a tuple. The zeroth element is the test statistic; in this case it is -1.34. The more negative this number is, the more likely it is that the data is stationary. The next item in the results tuple is the test p-value. Here it's 0.6. If the p-value is smaller than 0.05, we reject the null hypothesis and treat the time series as stationary. The item at index 4 of the tuple is a dictionary. This stores the critical values of the test statistic, which correspond to different significance levels. In this case, if we wanted a p-value of 0.05 or below, our test statistic needed to be below -2.91.
6. Interpreting the test result
We will ignore the rest of the tuple items for now, but you can find out more about them here.
7. The value of plotting
Remember that it is always worth plotting your time series as well as doing the statistical tests. These tests are very useful but sometimes they don't capture the full picture.
8. The value of plotting
Remember that Dickey-Fuller only tests for trend stationarity. In this example, although the behavior of the time series clearly changes and it is non-stationary, it passes the Dickey-Fuller test.
9. Making a time series stationary
So let's say we have a time series that is non-stationary. We need to transform the data into a stationary form before we can model it. You can think of this a bit like feature engineering in classic machine learning.
10. Taking the difference
Let's start with a non-stationary dataset. Here is an example of the population of a city. One very common way to make a time series stationary is to take its difference. This means that from each value in our time series we subtract the previous value.
11. Taking the difference
We can do this using the .diff() method of a pandas DataFrame. Notice that this gives us one NaN value at the start, since the first value has no previous value to subtract from it.
12. Taking the difference
We can get rid of this NaN value using the .dropna() method.
13. Taking the difference
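
The two steps above, differencing and then dropping the leading NaN, might look like the following minimal sketch. The population figures and the column name are invented for illustration and are not the dataset from the slides.

```python
import pandas as pd

# Hypothetical city population figures (illustrative values only).
df = pd.DataFrame(
    {"population": [1000, 1010, 1025, 1045, 1070]},
    index=pd.Index([2000, 2001, 2002, 2003, 2004], name="year"),
)

# Subtract the previous value from each value; the first row becomes NaN
# because it has no previous value.
df_diff = df.diff()
print(df_diff)

# Drop the leading NaN row so the differenced series is ready for modeling.
df_stationary = df_diff.dropna()
print(df_stationary)
```
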
Here is the time series after differencing. This time, taking the difference was enough to make it stationary, but for other time series we may need to take the difference more than once.
14. Other transforms
Sometimes we will need to perform other transformations to make the time series stationary. This could be to take the log or the square root of a time series, or to calculate the proportional change. It can be hard to decide which of these to do, but often the simplest solution is the best one.
15. Let's practice!
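
For reference, here is one possible way to write the log, square root and proportional change transforms with NumPy and pandas. The series values are made up, and pct_change is only one of several ways to express a proportional change; it is an assumption here, not necessarily the approach used in the course.

```python
import numpy as np
import pandas as pd

# Hypothetical series growing by roughly 10% per step (illustrative values only).
series = pd.Series([100.0, 110.0, 121.0, 133.1])

# Log transform: turns multiplicative growth into additive growth.
log_series = np.log(series)

# Square root transform: a milder way of compressing large values.
sqrt_series = np.sqrt(series)

# Proportional change: (current - previous) / previous, dropping the leading NaN.
pct_series = series.pct_change().dropna()

print(log_series)
print(sqrt_series)
print(pct_series)
```
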
You've learned how to test for stationarity and make time series stationary. Now let's practice!