In a previous post, we examined the fundamental tools to test time series for stationarity using Python, one of my favorite programming languages. If we use the tools described in that article, we will very soon realise that most time series are neither stationary nor mean reverting. In this new article we are going to examine how we can take two (or more) non-stationary time series and check whether a linear combination of them is stationary.
This is where we should introduce the notion of cointegration.
If we are able to find a stationary linear combination of several time series that are not themselves stationary, then these are called cointegrated.
Cointegrated Augmented Dickey-Fuller Test
In the previous post, we saw how the ADF test and the Variance Ratio test can check a single time series for mean reversion and stationarity. With several series, however, we don’t know in advance how many units (or what percentage) of each one we should combine to form the stationary basket we are looking for.
We have to be aware that just because a set of time series is cointegrating doesn’t mean that any random linear combination of the series will form a stationary basket. Only specific weights will do, so we need a way to find them.
To easily create the test we can use the procedure by Engle and Granger, which can be defined as the following steps:
- Determine the optimal hedge ratio by running a linear regression fit between the two series.
- Use the hedge ratio computed in step 1 to form a portfolio.
- Run a stationarity test on the portfolio created in step 2.
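The three steps above can be sketched with NumPy alone. This is only a minimal illustration on synthetic data, under a few assumptions: the series `x` and `y` are simulated, the helper names `hedge_ratio` and `dickey_fuller_tstat` are made up for this example, and the stationarity check is a plain Dickey-Fuller regression without the lagged differences that the full ADF test adds. In practice a library routine such as `adfuller` from `statsmodels.tsa.stattools` would be the more usual choice for step 3.

```python
import numpy as np

def hedge_ratio(x, y):
    """Step 1: slope of the OLS fit y = slope * x + intercept."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

def dickey_fuller_tstat(series):
    """Step 3 (simplified): t-statistic of lam in
    dz(t) = mu + lam * z(t-1) + e(t), with no lagged differences."""
    z = np.asarray(series, dtype=float)
    dz = np.diff(z)
    X = np.column_stack([np.ones(len(dz)), z[:-1]])
    coef, _, _, _ = np.linalg.lstsq(X, dz, rcond=None)
    resid = dz - X @ coef
    dof = len(dz) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

# Synthetic data (an assumption of this sketch): x is a random walk,
# y follows 2 * x plus stationary noise, so the pair is cointegrated.
rng = np.random.default_rng(42)
n = 1000
x = 50.0 + np.cumsum(rng.normal(size=n))
y = 2.0 * x + rng.normal(size=n)

h = hedge_ratio(x, y)                    # step 1
portfolio = y - h * x                    # step 2: long 1 unit of y, short h units of x
t_portfolio = dickey_fuller_tstat(portfolio)  # step 3
t_x = dickey_fuller_tstat(x)             # for comparison: a single random walk

print(f"hedge ratio: {h:.3f}")
print(f"t-stat portfolio: {t_portfolio:.2f}, t-stat x alone: {t_x:.2f}")
```

A strongly negative t-statistic for the portfolio, next to a much weaker one for `x` on its own, is the pattern we expect from a cointegrated pair. Because the hedge ratio is itself estimated, the statistic should be compared against critical values tabulated for the cointegration case rather than the ordinary Dickey-Fuller ones.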
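In order to test for cointegration of more than two variables, we have to use the Johansen test. If we start with the linear model we already described in the previous article:

$$\Delta y(t) = \lambda\, y(t-1) + \mu + \beta t + \alpha_1 \Delta y(t-1) + \cdots + \alpha_k \Delta y(t-k) + \epsilon_t$$

We can generalize it to the case where the variable y(t) is a vector representing multiple series, the constant μ is a vector as well, and the coefficients λ and α are actually matrices (we are also going to assume βt = 0 for simplicity). We can then rewrite the equation in the following way:

$$\Delta y(t) = \lambda\, y(t-1) + \mu + \alpha_1 \Delta y(t-1) + \cdots + \alpha_k \Delta y(t-k) + \epsilon_t$$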
Just like in the previous case with just one variable, if λ = 0 we don’t have cointegration. Let’s assume the rank of λ is r and the number of time series is n. The number of independent baskets that can be formed by different linear combinations of the cointegrating series is equal to r. The Johansen test estimates r for us in two different ways, both of them based on the eigendecomposition of λ: the first test produces the trace statistic, and the second one produces the maximum eigenvalue statistic (often just called the eigen statistic).
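To make the two statistics more concrete, here is a rough NumPy sketch of the calculation for the simplest version of the model, with a constant term and no lagged differences. It is an illustration under stated assumptions, not a replacement for a full implementation: the function name `johansen_stats` and the simulated series are invented for this example, and no critical values are computed. The `coint_johansen` function in `statsmodels.tsa.vector_ar.vecm` reports both statistics together with their critical values.

```python
import numpy as np

def johansen_stats(Y):
    """Trace and maximum-eigenvalue statistics for
    dY(t) = lam @ Y(t-1) + mu + e(t)  (no lagged differences).
    Y has shape (T, n), one column per series."""
    Y = np.asarray(Y, dtype=float)
    dY = np.diff(Y, axis=0)
    lagged = Y[:-1]
    T = dY.shape[0]
    # Remove the constant mu by demeaning both blocks.
    R0 = dY - dY.mean(axis=0)
    R1 = lagged - lagged.mean(axis=0)
    S00 = R0.T @ R0 / T
    S01 = R0.T @ R1 / T
    S11 = R1.T @ R1 / T
    # Eigenproblem of S11^-1 S10 S00^-1 S01, where S10 = S01.T
    A = np.linalg.solve(S11, S01.T @ np.linalg.solve(S00, S01))
    eigvals, eigvecs = np.linalg.eig(A)
    order = np.argsort(eigvals.real)[::-1]      # largest eigenvalue first
    eigvals = eigvals.real[order]
    eigvecs = eigvecs.real[:, order]
    terms = -T * np.log(1.0 - eigvals)
    trace = np.cumsum(terms[::-1])[::-1]        # trace[r]: H0 rank <= r
    max_eig = terms                             # max_eig[r]: H0 rank = r vs r + 1
    return eigvals, eigvecs, trace, max_eig

# Synthetic data (an assumption of this sketch): x and z are independent
# random walks, y tracks 2 * x, so exactly one stationary basket exists.
rng = np.random.default_rng(0)
T = 1000
x = np.cumsum(rng.normal(size=T))
z = np.cumsum(rng.normal(size=T))
y = 2.0 * x + rng.normal(size=T)
Y = np.column_stack([x, y, z])

eigvals, eigvecs, trace, max_eig = johansen_stats(Y)
# Eigenvector of the largest eigenvalue, scaled so the weight on y is 1.
basket = eigvecs[:, 0] / eigvecs[1, 0]

print("eigenvalues:", eigvals)
print("trace statistics:", trace)
print("max-eigenvalue statistics:", max_eig)
print("basket weights (x, y, z):", basket)
```

The eigenvector that belongs to the largest eigenvalue gives the weights of the most strongly mean-reverting basket, so the test delivers the hedge ratios along with the rank estimate. Each statistic still has to be compared with the tabulated critical value for its hypothesis before concluding anything about r.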
Here you can find a complete implementation of the Johansen Test: