TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial…

A Quick Introduction On Granger Causality Testing For Time Series Analysis

Augmented Dickey-Fuller (ADF) test, Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test, Vector Autoregression (VAR), Durbin–Watson statistic, Cointegration test

Susan Li
5 min read · Dec 23, 2020


The Granger causality test is a statistical hypothesis test for determining whether one time series offers useful information for forecasting another time series.

For example, consider the question: could we use Apple’s stock price today to predict Tesla’s stock price tomorrow? If so, we say that Apple’s stock price Granger causes Tesla’s stock price. If not, we say that Apple’s stock price does not Granger cause Tesla’s stock price.

The Data

So, let’s go to Yahoo Finance to fetch the adjusted close stock price data for Apple, Walmart and Tesla, from 2010-06-30 to 2020-12-18.

Visualize the Time Series

A time series can be represented using either a line chart or an area chart.

The Apple and Walmart time series have fairly similar trend patterns over the years, whereas Tesla stock IPOed just over 10 years ago and has surprised everyone with a rise of over 700% year-to-date in 2020.

ADF Test for Stationarity

The ADF test is one of the most widely used unit root tests. It helps us determine whether a time series is stationary or not.

Null hypothesis: The time series has a unit root. If we fail to reject it, this suggests the time series is not stationary.

Alternative hypothesis: If the null hypothesis is rejected, this suggests the time series is stationary.

The p-values are all well above the 0.05 alpha level, so we cannot reject the null hypothesis. This suggests that the three time series are not stationary.

KPSS Test for Stationarity

The KPSS test figures out if a time series is stationary around a mean or linear trend, or is non-stationary due to a unit root.

Null hypothesis: The time series is stationary

Alternative hypothesis: The time series is not stationary

The p-values are all less than the 0.05 alpha level. Therefore, we can reject the null hypothesis and conclude that the three time series are not stationary.

After cross-checking the ADF test and the KPSS test, we can conclude that the three time series we have here are not stationary. We will transform them to be stationary with the difference method.

Difference Method
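A minimal sketch of the difference method, assuming first-order differencing with pandas. The toy prices and column names below are made up for illustration.

```python
import pandas as pd

# Hypothetical toy prices; in the article these would be the three stock series.
df = pd.DataFrame(
    {"apple": [10.0, 12.0, 15.0, 14.0], "tesla": [5.0, 5.5, 7.0, 9.0]}
)

# First difference: each row becomes the change from the previous row.
# The first row has no predecessor, so it is NaN and gets dropped.
df_diff = df.diff().dropna()
print(df_diff)
```

If one round of differencing is not enough, the same call can be applied again to the differenced data.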

ADF Test Again

After transforming the data, the p-values are all well below the 0.05 alpha level, therefore, we reject the null hypothesis. So the current data is stationary.

KPSS Test Again

For some of the series, the KPSS null hypothesis of stationarity could not be rejected.

VAR Model

The VAR class assumes that the passed time series are stationary. Non-stationary or trending data can often be transformed to be stationary by first-differencing or some other method.

There is no hard-and-fast rule for choosing the lag order; it is basically an empirical issue. However, it is often advised to select the lag order with the smallest AIC value. Therefore, we will select lag order = 15.

# maxlags=15 with ic='aic' fits the lag order (up to 15) that minimizes AIC
results = model.fit(maxlags=15, ic='aic')
results.summary()

In the correlation matrix of residuals, the biggest correlation is 0.43 (Apple & Tesla).

Durbin-Watson Statistic

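The Durbin-Watson statistic is a measure of autocorrelation in the residuals from a regression analysis.

The statistic ranges from 0 to 4. A value of 2.0 means that no autocorrelation is detected in the residuals; values toward 0 indicate positive autocorrelation and values toward 4 indicate negative autocorrelation.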
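A small sketch of how the statistic behaves, using statsmodels' `durbin_watson` on synthetic residuals (both arrays are assumptions for illustration):

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
n = 2000

# Residuals with no autocorrelation
white_noise = rng.normal(size=n)

# Residuals with strong positive autocorrelation (AR(1) with coefficient 0.9)
correlated = np.zeros(n)
for t in range(1, n):
    correlated[t] = 0.9 * correlated[t - 1] + white_noise[t]

dw_noise = durbin_watson(white_noise)
dw_corr = durbin_watson(correlated)
print(f"white noise: {dw_noise:.2f}")  # expected to land close to 2
print(f"correlated:  {dw_corr:.2f}")   # expected to land well below 2
```

For a fitted VAR, passing the residuals (for example `results.resid`) to `durbin_watson` should give one statistic per series.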

Granger Causality Test

The following code was borrowed from Stack Overflow:

The rows are the responses (y) and the columns are the predictors (x). If a given p-value is below the significance level (0.05), we can reject the null hypothesis. For example, the value 0.0 in (row 1, column 2) lets us reject the null hypothesis and conclude that walmart_x Granger causes apple_y. Likewise, the 0.0 in (row 2, column 1) means that apple_x Granger causes walmart_y.

All the time series in the above data Granger cause one another.

Forecasting

Remember that we transformed the data with the difference method; now we will invert the transformation.

The Jupyter notebook can be found on GitHub. Happy Holidays!
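Assuming first-order differencing, inverting the transformation amounts to adding the cumulative sum of the forecast changes to the last observed level. The toy numbers below are made up for illustration.

```python
import pandas as pd

# Hypothetical observed prices (levels)
prices = pd.DataFrame({"apple": [10.0, 12.0, 15.0, 14.0]})

# Hypothetical forecasts of the DIFFERENCED series (day-over-day changes)
forecast_diff = pd.DataFrame({"apple": [1.0, -0.5, 2.0]})

# Invert the differencing: last observed level + running total of forecast changes
forecast_level = prices.iloc[-1] + forecast_diff.cumsum()
print(forecast_level)
```

With a fitted statsmodels VAR, the forecast changes would typically come from `results.forecast(...)` on the differenced data before this step is applied.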

Written by Susan Li

Changing the world, one post at a time. Sr Data Scientist, Toronto Canada. https://www.linkedin.com/in/susanli/
