1. Stationarity and nonstationarity
Let's proceed with the basic concepts of stationarity, its importance, and how to coerce nonstationary data to stationarity.
2. Stationarity
In the context of time series, stationarity refers to the
stability of the mean - that is, there is no trend
stability of the correlation - that is, the correlation structure of the data remains constant over time.
The time series plotted here may help in understanding stationarity better.
The left-hand plot is stationary: there is no trend, and the time series behaves the same between, for example, time points 1 to 50, 50 to 100, and so on.
On the other hand, the plot on the right looks very different between time points 1 to 50 and 150 to 200. The means in these time intervals are different, as is the variability, with the end of the series being more variable than the beginning.
3. Stationarity
Stationarity means that we can use simple averaging to estimate the mean and the correlation:
If the mean is constant, then you can estimate it by the sample average, x-bar, and
If the correlation structure is constant, then for example, we can use all the pairs of data that are 1 time unit apart, (x_1, x_2), (x_2, x_3), and so on, to estimate lag 1 correlation. This works because the relationship between contiguous values of the series remains the same over time. Similarly, we can use (x_1, x_3), (x_2, x_4) and so on to estimate the lag 2 correlation.
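The averaging idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the series is simulated (a hypothetical stationary series in which each value is 0.8 times the previous value plus random noise), and the function name `sample_autocorrelation` is made up for this example.

```python
import random
from statistics import fmean

def sample_autocorrelation(x, lag):
    """Estimate the lag-`lag` correlation from all pairs (x[t], x[t + lag])."""
    n = len(x)
    xbar = fmean(x)  # the sample average estimates the constant mean
    # Sum the products of deviations over every pair that is `lag` units apart.
    cross = sum((x[t] - xbar) * (x[t + lag] - xbar) for t in range(n - lag))
    total = sum((v - xbar) ** 2 for v in x)
    return cross / total

random.seed(0)
# Hypothetical stationary series (an assumption, not data from the lesson):
# each value is 0.8 times the previous value plus Gaussian noise.
x = [0.0]
for _ in range(1999):
    x.append(0.8 * x[-1] + random.gauss(0, 1))

r1 = sample_autocorrelation(x, 1)  # uses (x_1, x_2), (x_2, x_3), ...
r2 = sample_autocorrelation(x, 2)  # uses (x_1, x_3), (x_2, x_4), ...
print(round(r1, 2), round(r2, 2))
```

Pooling all the pairs is only sensible because the relationship between values a fixed distance apart is assumed not to change over time.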
4. Southern Oscillation Index
The Southern Oscillation Index is reasonably stable. It looks the same in any little segment of time (although there might be some slight trend).
5. Southern Oscillation Index
The scatterplots show correlation in terms of lag. This is called autocorrelation, and it is the same correlation you learned about in regression, computed between the series and a lagged copy of itself.
The graphic shows that the Southern Oscillation Index, which is a surrogate for sea surface temperature, is positively correlated with itself one month apart, but negatively correlated with itself six months apart (as it is hot in the summer and cold in the winter).
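To see how a seasonal series can be positively correlated at a short lag and negatively correlated half a cycle later, here is a small sketch. It does not use the Southern Oscillation Index data; it assumes a hypothetical monthly series made of a 12-month cosine cycle plus noise, and the helper names `pearson` and `lag_correlation` are made up for this example.

```python
import math
import random

def pearson(u, v):
    """Ordinary correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def lag_correlation(x, lag):
    """Correlate the series with itself `lag` time units later (lag >= 1)."""
    return pearson(x[:-lag], x[lag:])

random.seed(2)
# Hypothetical monthly series with a yearly cycle (an assumption, not real data).
monthly = [math.cos(2 * math.pi * t / 12) + random.gauss(0, 0.5)
           for t in range(480)]

r_lag1 = lag_correlation(monthly, 1)  # one month apart
r_lag6 = lag_correlation(monthly, 6)  # six months apart
print(r_lag1 > 0, r_lag6 < 0)
```

Each call corresponds to one scatterplot: the series plotted against itself shifted by the given lag.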
6. Random Walk Trend
The global temperature deviations series is an example of a random walk, where the value of the series at time t is its value at time t-1 plus a completely random movement. Differencing ("today minus yesterday") can make this kind of process stationary.
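A minimal sketch of why differencing works for a random walk, using simulated data rather than the temperature series. The Gaussian steps and the function name `difference` are assumptions made for this illustration.

```python
import random

random.seed(1)
# The "completely random movements": independent Gaussian steps (an assumption).
steps = [random.gauss(0, 1) for _ in range(500)]

# Random walk: the value at time t is the value at time t-1 plus a random step.
walk = []
level = 0.0
for w in steps:
    level += w
    walk.append(level)

def difference(x):
    """First difference: "today minus yesterday"."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

diffed = difference(walk)
# Each difference is just the random step that was added at that time point,
# so the differenced series has no trend left in it.
print(len(walk), len(diffed))
```

Note that differencing shortens the series by one observation, because the first value has no "yesterday".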
7. Trend Stationarity
The price of chicken series is closer to "trend stationarity", which is stationary behavior around a simple trend. Differencing works here too.
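As a rough illustration with simulated data (not the chicken price series), assume a straight-line trend with slope 0.5 plus stationary noise. The intercept, slope, and noise level are hypothetical values chosen for this sketch.

```python
import random
from statistics import fmean

random.seed(3)
slope = 0.5  # hypothetical trend slope
# Trend-stationary series: a straight line plus stationary noise.
series = [10.0 + slope * t + random.gauss(0, 1) for t in range(500)]

def difference(x):
    """First difference: x[t] - x[t-1]."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

diffed = difference(series)
half = len(diffed) // 2

# The original series has very different means in its two halves ...
gap_before = fmean(series[250:]) - fmean(series[:250])
# ... while the differenced series fluctuates around the slope in both halves.
gap_after = fmean(diffed[half:]) - fmean(diffed[:half])
print(round(gap_before, 1), round(gap_after, 3), round(fmean(diffed), 2))
```

Differencing turns the linear trend into a constant (the slope), so the mean of the differenced series no longer changes over time.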
8. Nonstationarity in trend and variability
Finally, if there is both trend and heteroscedasticity, taking logs and then differencing can help, as in the Johnson and Johnson earnings data set. First, logging positive-valued data can stabilize the variance. Second, differencing the data will detrend it.
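The two steps can be sketched on simulated data (not the Johnson and Johnson series). The example assumes a positive-valued series with exponential growth and noise that scales with the level; the growth rate and noise level are hypothetical.

```python
import math
import random
from statistics import fmean, stdev

random.seed(4)
# Hypothetical positive series: exponential growth with multiplicative noise,
# so the variability increases as the level increases.
series = [math.exp(0.02 * t + random.gauss(0, 0.1)) for t in range(200)]

def difference(x):
    """First difference: x[t] - x[t-1]."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

raw_diff = difference(series)                         # differencing only
log_diff = difference([math.log(v) for v in series])  # log, then difference

half = len(raw_diff) // 2
# Ratio of the spread in the second half to the spread in the first half.
raw_ratio = stdev(raw_diff[half:]) / stdev(raw_diff[:half])
log_ratio = stdev(log_diff[half:]) / stdev(log_diff[:half])
print(raw_ratio > log_ratio, round(fmean(log_diff), 2))
```

Differencing alone removes the trend but leaves the later values more variable than the earlier ones; taking logs first keeps the spread roughly the same in both halves.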
9. Let's practice!