1. Cointegration Models
The idea behind cointegration is that two series can each be unpredictable on their own while a combination of them is predictable.
2. What is Cointegration?
Even if the prices of two different assets, P and Q, both follow random walks, it is still possible that a linear combination of them, such as P minus c times Q for some constant c, is not a random walk. If that's true, then even though P and Q are not forecastable individually because they're random walks, the linear combination is forecastable, and we say that P and Q are cointegrated.
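To make that definition concrete, here is a minimal simulation sketch. It is not from the course material: the coefficient c = 2.0, the AR(1) parameter 0.5, the seed, and all variable names are assumptions chosen for illustration. Q is built as a random walk, and P is built as c times Q plus mean-reverting noise, so P wanders along with Q while P minus c times Q stays close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
n = 1000

# Q is a pure random walk: the cumulative sum of i.i.d. shocks.
Q = np.cumsum(rng.normal(size=n))

# A mean-reverting AR(1) series that will play the role of the "leash".
# The 0.5 coefficient is an arbitrary illustrative choice.
spread = np.zeros(n)
for t in range(1, n):
    spread[t] = 0.5 * spread[t - 1] + rng.normal()

# P shares Q's random-walk trend, so on its own it is non-stationary too.
c = 2.0                          # hypothetical cointegrating coefficient
P = c * Q + spread

# The linear combination P - c*Q removes the shared trend and
# leaves only the mean-reverting part.
combo = P - c * Q

print("std of P:        ", P.std())
print("std of P - c * Q:", combo.std())
```

In this construction the combination is mean reverting by design. With real prices you would not know c in advance, and you would have to estimate it and then test the result.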
3. Analogy: Dog on a Leash
The best analogy I've heard is of a dog owner walking his dog with a retractable leash. If you look at the position of the dog owner, it may follow a random walk, and if you look at the position of the dog separately, it may also follow a random walk, but the distance between them, the difference of their positions, may very well be mean reverting: if the dog is behind the owner, he may run to catch up and if the dog is ahead, the length of the leash may prevent him from getting too far ahead. The dog and its owner are linked together and their distance is a mean reverting process.
4. Example: Heating Oil and Natural Gas
Both heating oil prices and natural gas prices look like they're random walks.
5. Example: Heating Oil and Natural Gas
But when you look at the spread, or difference between them, the series looks like it's mean reverting. For example, when heating oil spiked down relative to natural gas in 2001, the spread reverted back.
6. What Types of Series are Cointegrated?
With commodities, there may be economic forces that link the two prices. Consider heating oil and natural gas. Some power plants have the ability to use either one, depending on which has become cheaper. So when heating oil has dipped below natural gas, increased demand for heating oil will push it back up. Platinum and palladium are substitutes in some types of catalytic converters used for emission control. Corn and wheat are substitutes for animal feed. Corn and sugar are substitutes as sweeteners, and so on.
How about bitcoin and ethereum? In one of the exercises, you'll look at whether they are cointegrated.
For stocks, a natural starting point for identifying cointegrated pairs is stocks in the same industry. However, competitors are not necessarily economic substitutes. Think of Apple and Blackberry. It's not necessarily the case that when one of those companies' stock prices jumps up, the other catches up. In this case, it's more like the dog broke the leash and ran away from the owner.
7. Two Steps to Test for Cointegration
You can break down the process for testing whether two series are cointegrated into two steps. First, you regress the level of one series, P, on the level of the other series, Q, to get the slope coefficient c. Then, you run the Augmented Dickey-Fuller test, the test for a random walk that you learned about in the second chapter, on the linear combination of the two series, P minus c times Q. If the test rejects the hypothesis that this combination is a random walk, that is evidence the two series are cointegrated. Alternatively, statsmodels has a function, coint, that combines both steps.
8. Let's practice!
Now let's try some examples.