How do we measure the correlation of time series? Pearson correlation analysis

When you notice that two time series follow a similar trend, you may want to measure how strongly they are correlated. The Pearson correlation coefficient, named after Karl Pearson, is one of the most widely used measures for this, and Yule examined its use on time series in 1926. If you are interested in knowing more about that, this paper may be relevant to you. For those who simply want to calculate the Pearson value, the SciPy library provides a function named “pearsonr”. Alternatively, the NumPy library has a function named “corrcoef”. Here is my example:

(1) It is easy to see that the two plots follow a similar trend, even though the scales of their y values are different.

(2) Using the pearsonr function of the SciPy library, we calculate the Pearson correlation coefficient. Here is the Python code.

from scipy.stats import pearsonr
import numpy as np
import sys

if __name__ == "__main__":
    list1 = [241, 69, 72, 143, 128, 68, 126, 82, 126, 108, 68, 90, 81, 60, 72, 93, 80, 97, 65, 74, 71]
    list2 = [621711, 190310, 204282, 319612, 367879, 200600, 329108, 226406, 399833, 253989, 233108, 301069, 257548,
             206579, 255322, 268418, 279106, 304694, 216643, 236923, 254406]

    # Pearson correlation works on paired observations, so both series need the same length
    if len(list1) != len(list2):
        print("error: the two series must contain the same number of elements")
        sys.exit(1)

    # scipy library: returns the correlation coefficient together with a p-value
    print("scipy result: ", pearsonr(list1, list2))

    # numpy library: returns the 2x2 correlation matrix; the off-diagonal entries hold the coefficient
    print("numpy result: ", str(np.corrcoef(list1, list2)))

(3) As a result, we see that the two series are highly correlated, with a Pearson coefficient of 0.94.
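
To make it clearer what the library functions compute, here is a minimal sketch of the coefficient written directly from its textbook definition: the sum of the products of the deviations from the means, divided by the square root of the product of the summed squared deviations. The helper name pearson is my own for illustration and is not part of SciPy or NumPy. Because the means are subtracted and the result is divided by the spreads, the different y scales of the two series do not affect the value.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient computed directly from its definition."""
    if len(xs) != len(ys):
        raise ValueError("both series must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how the two series vary together around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: how much each series varies on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


list1 = [241, 69, 72, 143, 128, 68, 126, 82, 126, 108, 68, 90, 81, 60, 72, 93, 80, 97, 65, 74, 71]
list2 = [621711, 190310, 204282, 319612, 367879, 200600, 329108, 226406, 399833, 253989, 233108, 301069, 257548,
         206579, 255322, 268418, 279106, 304694, 216643, 236923, 254406]

r = pearson(list1, list2)
print("pearson from definition:", r)

# Rescaling one series leaves the coefficient unchanged (up to floating-point error).
r_rescaled = pearson(list1, [y / 1000 for y in list2])
print("after rescaling list2:  ", r_rescaled)
```

The value printed here should agree with the pearsonr and corrcoef results up to floating-point rounding.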

(4) You may also want to see how that value is affected when a single value in one of the series changes. To find out, change the last element of list1 from 71 to 710.

(5) You will observe that the Pearson score drops sharply, from 0.94 to 0.21, which shows how sensitive the coefficient is to a single outlier.
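
The experiment in steps (4) and (5) can be sketched in a few lines. This is only an illustration using the same data: the variable names list1_outlier, r_original and r_outlier are my own, and I take entry [0, 1] of the matrix returned by np.corrcoef to get the coefficient as a single number.

```python
import numpy as np

list1 = [241, 69, 72, 143, 128, 68, 126, 82, 126, 108, 68, 90, 81, 60, 72, 93, 80, 97, 65, 74, 71]
list2 = [621711, 190310, 204282, 319612, 367879, 200600, 329108, 226406, 399833, 253989, 233108, 301069, 257548,
         206579, 255322, 268418, 279106, 304694, 216643, 236923, 254406]

# np.corrcoef returns a 2x2 matrix; entry [0, 1] is the coefficient between the two series.
r_original = np.corrcoef(list1, list2)[0, 1]

# Copy list1 and replace only its last element (71 -> 710), as in step (4).
list1_outlier = list1[:-1] + [710]
r_outlier = np.corrcoef(list1_outlier, list2)[0, 1]

print("original series:    ", round(r_original, 2))
print("with one outlier:   ", round(r_outlier, 2))
```

Working on a copy keeps the original list1 intact, so both coefficients can be printed side by side.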

Originally published at Emre Calisir.