1. Autocorrelation
Autocorrelation is a very powerful tool for time series analysis. It helps us study how each time series observation is related to its recent past. Processes with greater autocorrelation are more predictable than those with none.
2. Autocorrelation - I
Let's start with lag 1 autocorrelation. Consider again the hypothetical time series of stock prices for company A, called stock_A. The time series is shown on the left. You can compare each observation with the preceding one; that is, you can compare Today versus Yesterday for each day. A scatterplot of each observation versus its previous observation is shown on the right. You can see a clear and moderately strong positive association between these observations. You can use the cor() function to calculate the correlation between them, and it is about 0.84. This indicates that when the previous stock price is relatively high, the current stock price is likely to also be relatively high, and when the previous price is relatively low, the current price will tend to be relatively low.
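As a rough sketch of the Today versus Yesterday comparison, the code below pairs each observation with the one before it and passes the pairs to cor(). The course data is not reproduced here, so stock_A is a simulated stand-in (an assumption, built with arima.sim()), and the value it gives will not match the 0.84 quoted above.

```r
# Hypothetical stand-in for stock_A: a simulated, positively autocorrelated series
set.seed(1)
stock_A <- 100 + as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

n <- length(stock_A)
today     <- stock_A[-1]   # observations 2, ..., n
yesterday <- stock_A[-n]   # observations 1, ..., n - 1

# Scatterplot of each observation against the preceding one
plot(yesterday, today, xlab = "Yesterday", ylab = "Today")

# Lag 1 autocorrelation as the ordinary correlation of the pairs
lag1_cor <- cor(yesterday, today)
lag1_cor
```

Dropping the first element for today and the last element for yesterday keeps the two vectors aligned, so element i of each refers to consecutive days.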
3. Autocorrelation - II
The lag 2 autocorrelation is defined similarly, but now you compare Today's price with the price two days before. A scatterplot of these pairs is shown in the figure, and you again see a clear and moderately strong positive association. This indicates that when the stock price two days ago was relatively high, the current stock price is likely to also be relatively high, and when the price two days before was relatively low, the current price will tend to be relatively low. The correlation between these pairs is 0.76, which is still large, but not as large as the lag 1 autocorrelation, which was 0.84.
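The same pairing idea extends to lag 2 by shifting the series by two positions instead of one. This is a minimal sketch on a simulated stand-in for stock_A (an assumption, not the course data), so the numbers will differ from those quoted above.

```r
# Hypothetical stand-in for stock_A: a simulated, positively autocorrelated series
set.seed(1)
stock_A <- 100 + as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

n <- length(stock_A)
today        <- stock_A[3:n]         # observations 3, ..., n
two_days_ago <- stock_A[1:(n - 2)]   # observations 1, ..., n - 2

# Lag 2 autocorrelation as the correlation of (two days ago, today) pairs
lag2_cor <- cor(two_days_ago, today)
lag2_cor
```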
4. Autocorrelations at lag 1 and 2 - I
The autocorrelation function, or ACF, is simply the autocorrelation defined as a function of the time lag: 1, 2, and so on.
5. Autocorrelations at lag 1 and 2 - II
In R, you can apply the acf() function to the time series to estimate the autocorrelation at several lags simultaneously.
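The sketch below shows what acf() computes at each lag by reproducing its estimate by hand on a simulated stand-in for stock_A (an assumption). The helper manual_acf is a hypothetical name introduced only for illustration. Note that acf() centers every term on the overall mean and divides by the total sum of squares, so its values can differ slightly from cor() applied to lagged pairs.

```r
# Hypothetical stand-in for stock_A: a simulated, positively autocorrelated series
set.seed(1)
stock_A <- 100 + as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

# Hand-rolled version of the sample autocorrelation at lag k (illustrative helper)
manual_acf <- function(x, k) {
  n <- length(x)
  d <- x - mean(x)   # deviations from the overall mean
  sum(d[1:(n - k)] * d[(1 + k):n]) / sum(d^2)
}

# acf() returns lag 0 first, so lags 1 and 2 are elements 2 and 3 of $acf
acf_est <- acf(stock_A, lag.max = 2, plot = FALSE)$acf

c(manual = manual_acf(stock_A, 1), acf = acf_est[2])
c(manual = manual_acf(stock_A, 2), acf = acf_est[3])
```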
6. The autocorrelation function - I
To report the estimated values, set the additional argument plot = FALSE. Here we can see the autocorrelation estimates for all lags 1 through 10.
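A short sketch of reporting the estimates without a plot follows, again on a simulated stand-in for stock_A (an assumption). The lag.max = 10 argument is set explicitly here to match the ten lags mentioned above. The object returned by acf() stores lag 0, which is always 1, as its first element, so it is dropped when collecting lags 1 through 10.

```r
# Hypothetical stand-in for stock_A: a simulated, positively autocorrelated series
set.seed(1)
stock_A <- 100 + as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

# Estimate the autocorrelations without drawing the plot
acf_out <- acf(stock_A, lag.max = 10, plot = FALSE)
print(acf_out)

# Collect lags 1 through 10 in a named vector, dropping lag 0
acf_values <- setNames(as.numeric(acf_out$acf)[-1], paste0("lag_", 1:10))
acf_values
```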
7. The autocorrelation function - II
To interpret these estimates as a function of the time lag, you can make an autocorrelation function plot. Simply apply the acf() function to the time series with the argument plot = TRUE, which is the default.
You can see the result in the figure. The time lag is indicated on the horizontal axis, and the height of each vertical line indicates the value of the estimated autocorrelation at that lag. In this example, the autocorrelations are largest at the low lags on the left, and they decrease toward zero as the lag increases to the right. This implies that each observation is positively associated with its recent past, at least through 10 lags, but that the association becomes weaker as the lag increases.
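For reference, a minimal sketch of producing such a plot is shown here on a simulated stand-in for stock_A (an assumption), so the picture will not match the course figure exactly. The plot title is an illustrative choice.

```r
# Hypothetical stand-in for stock_A: a simulated, positively autocorrelated series
set.seed(1)
stock_A <- 100 + as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

# Draw the ACF plot: lag on the horizontal axis, one vertical line per lag.
# When plotting, acf() returns its estimates invisibly, so they can still be stored.
acf_plot <- acf(stock_A, lag.max = 10, plot = TRUE, main = "ACF of stock_A (simulated)")
```

The line at lag 0 always has height 1, since every observation is perfectly correlated with itself; the lines of interest start at lag 1.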
8. Let's practice!
Excellent! Now let's finish this chapter with some autocorrelation exercises.