Information-Theoretic Alternatives To Pearson’s Correlation And Portfolio ‘Beta’
This is the second part of a two-part post illustrating the practical importance of accounting for both nonlinearities and temporal dependencies when assessing portfolio risk, which the widely adopted (Pearson’s) correlation coefficient fails to do.
In Part I we provided a basic introduction to Pearson’s correlation, its relation to linear regression and portfolio beta, and its limitations as far as measuring dependence between assets is concerned.
In this post, we provide empirical evidence that the i.i.d. Gaussian assumption for asset returns does not hold for U.S. stocks and futures, and we present an alternative to Pearson’s correlation, namely the information-adjusted correlation, which measures the association between time series (not random variables), while fully capturing nonlinearities and, more importantly, temporal structures. We then use information-adjusted correlation to construct an information-theoretic alternative to the (CAPM’s) beta of a portfolio relative to the market, which we call information-adjusted beta.
Measuring Time Series Association With Information Theory
Note: All logarithms in this section are base 2.
Entropy As A Measure Of Uncertainty
The amount of information contained in a complex system modeled as a random variable is typically defined as the amount of uncertainty in that random variable.
Measuring the amount of uncertainty in a random variable is a problem that is as old as information theory itself. The canonical solution to this problem is the notion of information entropy introduced by Claude Shannon, the father of information theory, in his seminal paper A Mathematical Theory Of Communication, in which he focused on discrete random phenomena (i.e. those taking a countable number of values).
The notion of information entropy introduced by Shannon for discrete random variables was later generalized to any random variable.
An important related measure is the so-called conditional entropy. Intuitively, the conditional entropy of a random variable y given x is the amount of information/uncertainty that remains about random variable y given random variable x.
More specifically, it is the difference between the amount of uncertainty (or entropy) there is in y and x collectively, and the amount of uncertainty there is in x.
As is illustrated in the Venn diagram above, the amount of information contained in y and x collectively is rarely the sum of the amount of information contained in y and that contained in x, as there could be information redundancy between y and x.
One of the beauties of using entropy as a measure of information is the fact that the conditional entropy of y given x is never greater than the entropy of y, and the two are equal if and only if y and x are independent (i.e. there is no association between the two whatsoever, linear or nonlinear).
Unlike Pearson’s correlation, conditional entropy captures both linear and nonlinear associations between random variables.
A related measure of association is the mutual information between y and x, defined as the difference between the entropy of y and the conditional entropy of y given x. As the name suggests, it reflects the amount of information shared between y and x.
As it turns out, this quantity coincides with a formal and popular statistical measure of how far off we would be if we were to assume that y and x are independent, namely the so-called Kullback-Leibler divergence.
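To make these definitions concrete, here is a minimal sketch on a toy discrete example that is not taken from the post: x is uniform on {-1, 0, 1} and y = x². The helper name `entropy` and the toy distribution are assumptions made for illustration. The sketch computes the conditional entropy as the joint entropy minus the entropy of x, the mutual information as the entropy of y minus the conditional entropy, and checks it against the Kullback-Leibler divergence between the joint distribution and the product of its marginals.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (zero cells are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Toy joint distribution (illustrative assumption): x uniform on {-1, 0, 1}, y = x**2.
x_vals = np.array([-1.0, 0.0, 1.0])
y_vals = np.array([0.0, 1.0])
joint = np.zeros((3, 2))  # joint[i, j] = P(x = x_vals[i], y = y_vals[j])
for i, x in enumerate(x_vals):
    joint[i, int(x ** 2)] = 1 / 3

p_x = joint.sum(axis=1)  # marginal distribution of x
p_y = joint.sum(axis=0)  # marginal distribution of y

h_x, h_y, h_xy = entropy(p_x), entropy(p_y), entropy(joint)
h_y_given_x = h_xy - h_x          # conditional entropy of y given x
mutual_info = h_y - h_y_given_x   # mutual information between y and x

# Kullback-Leibler divergence between the joint and the product of marginals.
indep = np.outer(p_x, p_y)
mask = joint > 0
kl = float((joint[mask] * np.log2(joint[mask] / indep[mask])).sum())

# Pearson's correlation computed from the same joint distribution.
e_x = (p_x * x_vals).sum()
e_y = (p_y * y_vals).sum()
cov = (joint * np.outer(x_vals - e_x, y_vals - e_y)).sum()
var_x = (p_x * (x_vals - e_x) ** 2).sum()
var_y = (p_y * (y_vals - e_y) ** 2).sum()
pearson = cov / np.sqrt(var_x * var_y)

print(f"H(y|x) = {h_y_given_x:.4f} bits, I(y;x) = {mutual_info:.4f} bits")
print(f"KL = {kl:.4f} bits, Pearson = {pearson:.4f}")
```

In this toy case y is fully determined by x, so the conditional entropy vanishes and the mutual information equals the entropy of y, yet Pearson’s correlation is zero because the relationship is purely nonlinear.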
In short, even if we were to assume that asset returns are i.i.d., we can borrow from information theory to construct a measure of association that, unlike Pearson’s correlation, fully captures both linear and nonlinear associations.
Entropy Rate As A Measure Of Information In Time Series
The notion of time plays too central a role in economics and financial markets to believe that order doesn’t matter, and that the same random phenomenon keeps repeating itself. Simply put, assuming returns are i.i.d., more often than not, is wrong. The natural probabilistic abstraction to model financial markets is the notion of a stochastic process or a time series, not the notion of a random variable.
A time series is basically a time-stamped collection of random variables.
Fortunately, the notions of entropy and conditional entropy extend to time series through the entropy rate of a time series and the conditional entropy rate of one time series given another.
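For reference, the standard textbook forms of these two quantities can be written as follows. This is a sketch in notation chosen here, not necessarily the notation of the Yellow Paper: H denotes the joint entropy of the first T observations, and the limit is assumed to exist, as it does for stationary processes. The conditional entropy rate mirrors the random-variable definition given earlier, as the joint rate minus the rate of the conditioning series.

```latex
h(\{x_t\}) = \lim_{T \to \infty} \frac{1}{T} \, H(x_1, \dots, x_T),
\qquad
h(\{y_t\} \mid \{x_t\}) = h(\{x_t, y_t\}) - h(\{x_t\}).
```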
Their interpretations are very similar. The entropy rate measures the amount of information produced per unit of time by a time series. The conditional entropy rate measures the amount of new information produced by a time series per unit of time, that is not already contained in another time series.
Similarly to the random variable case, the difference between the entropy rate of a time series and its conditional entropy rate given another time series reflects the amount of information shared between the two time series per unit of time.
Crucially, the notion of conditional entropy rate goes well beyond linear associations of samples corresponding to the same time, and captures any association between the two time series, linear or nonlinear, and across time.
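A small numerical sketch may help show why order matters. The example below uses a two-state Markov chain of "up" and "down" days, a toy model chosen here for illustration and not one used in the post. For a stationary Markov chain the entropy rate is the average, under the stationary distribution, of the entropy of each row of the transition matrix; the sketch compares it with the entropy one would compute if the observations were treated as i.i.d. draws.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical transition matrix: tomorrow repeats today's state 90% of the time.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])

# Stationary distribution: left eigenvector of P associated with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
pi = pi / pi.sum()

# Entropy per observation if temporal order is ignored (i.i.d. view).
marginal_entropy = entropy_bits(pi)

# Entropy rate of the chain: new information produced per time step.
entropy_rate = float(sum(pi[i] * entropy_bits(P[i]) for i in range(len(pi))))

print(f"i.i.d. view: {marginal_entropy:.3f} bits per step")
print(f"entropy rate: {entropy_rate:.3f} bits per step")
```

The i.i.d. view credits each observation with a full bit of information, while the entropy rate is less than half of that, because most of each day’s state is already predictable from the previous day.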
Pairwise Incremental Diversification As A Measure Of Dependence Between Assets
In our Yellow Paper, we define the amount of diversification an asset adds to the other as the mutual information timescale (inverse of the rate of mutual information) of time series of returns:
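Written out from the verbal definition above, and using notation assumed here rather than quoted from the Yellow Paper (ID for incremental diversification, I for the rate of mutual information, h for entropy rates), the definition reads:

```latex
\mathrm{ID}(\{x_t\}; \{y_t\}) = \frac{1}{I(\{x_t\}; \{y_t\})},
\qquad
I(\{x_t\}; \{y_t\}) = h(\{y_t\}) - h(\{y_t\} \mid \{x_t\}).
```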
Intuitively, this quantity can be interpreted as the amount of time it would take on average to see 1 bit of shared information between the two assets (or equivalently their time series of returns). The less related two assets are, the longer it would take to observe 1 bit of mutual information between their returns time series. Similarly, the more related two assets are, the less time it would take to see 1 bit of mutual information between the two.
Incremental diversification is never negative: it ranges from 0 (when one time series of returns can be fully determined from the other) to +∞ (when the two time series of returns are independent).
From Incremental Diversification To Information-Adjusted Correlation
The alert reader has certainly noticed that we haven’t made any specific distribution assumption to ensure that our notion of incremental diversification fully captures any form of association between two time series of returns, linear or nonlinear, at the same time or across time. Moreover, it is possible to estimate incremental diversification from empirical evidence without placing any arbitrary distribution assumption (see our Yellow Paper for more details).
Now, here’s the deal. We know that in the case of i.i.d. Gaussians, Pearson’s correlation is sufficient to characterize any form of association, linear or otherwise. This raises the question: what is the functional relationship between incremental diversification and Pearson’s correlation in the case of i.i.d. Gaussians? It turns out that the answer is available in closed form:
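The closed form can be reconstructed from a standard result, under the assumption that the pairs of returns are i.i.d. and jointly Gaussian with Pearson’s correlation ρ. In that case the rate of mutual information is simply the mutual information of one pair, and its inverse gives the incremental diversification (logarithms in base 2; the symbols ID and I are notation assumed here):

```latex
I = -\tfrac{1}{2} \log_2\!\left(1 - \rho^2\right)
\quad\Longrightarrow\quad
\mathrm{ID} = \frac{1}{I} = \frac{-2}{\log_2\!\left(1 - \rho^2\right)}.
```

As expected, ID tends to +∞ as ρ tends to 0 and to 0 as |ρ| tends to 1.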
We can also ask the reverse question. Given that we know how to accurately estimate incremental diversification, what Pearson’s correlation coefficient would the estimated incremental diversification value correspond to under the i.i.d. Gaussian assumption? The answer to this question — obtained by inverting the equation above — is what we refer to as information-adjusted correlation.
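Inverting the i.i.d. Gaussian relationship between incremental diversification (written ID here, an assumed notation) and Pearson’s correlation pins down the magnitude of the information-adjusted correlation:

```latex
\left|\mathrm{ACorr}\right| = \sqrt{1 - 2^{-2/\mathrm{ID}}}.
```

The inversion alone does not determine a sign, since ID depends on the correlation only through its square; the sign has to come from a separate convention, which is not stated in this post.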
We can then independently estimate Pearson’s correlation and compare it to information-adjusted correlation. If the Gaussian i.i.d. assumption is valid, then the two values should be close to each other!
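The mapping and its inverse are easy to sketch numerically. The function names below are hypothetical, and the sketch only covers the i.i.d. Gaussian relationship, namely ID = -2 / log2(1 - ρ²) and |ACorr| = sqrt(1 - 2^(-2/ID)); it does not estimate incremental diversification from data, which is what the Yellow Paper addresses.

```python
import math

def gaussian_incremental_diversification(rho):
    """Incremental diversification implied by Pearson's correlation rho
    when pairs of returns are i.i.d. jointly Gaussian (base-2 logarithms)."""
    r2 = rho * rho
    if r2 >= 1.0:
        return 0.0       # one series is fully determined by the other
    if r2 == 0.0:
        return math.inf  # independent series never share a bit
    # log1p keeps precision when rho is very close to 0.
    return -2.0 * math.log(2.0) / math.log1p(-r2)

def acorr_magnitude(incremental_diversification):
    """Magnitude of the correlation that an incremental diversification value
    would correspond to under the i.i.d. Gaussian assumption."""
    if incremental_diversification == 0.0:
        return 1.0
    return math.sqrt(1.0 - 2.0 ** (-2.0 / incremental_diversification))

# Round trip: under the Gaussian i.i.d. assumption we recover |rho|.
for rho in (0.1, 0.3, 0.5, 0.9):
    id_value = gaussian_incremental_diversification(rho)
    print(rho, id_value, acorr_magnitude(id_value))
```

In practice the input to `acorr_magnitude` would be a model-free estimate of incremental diversification, and the output would be compared with an independently estimated Pearson’s correlation.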
A Simple And Practical Black-Swan Test
That all sounds nice, but where are the black swans, you might ask? If there is one practical takeaway from this post, this is it:
- Read our Yellow Paper to figure out how to estimate ACorr from data.
- Case 1: ACorr ≈ Corr: If you observe that information-adjusted correlation is (approximately) equal to Pearson’s correlation, then your data are consistent with the i.i.d. Gaussian assumption, and you can trust your favorite linear i.i.d. factor model.
- Case 2: |ACorr| < |Corr|: Sorry, but there is a bug in your code! This is mathematically impossible.
- Case 3: |ACorr| >> |Corr|: Red flag! There is a whole lot of risk in your portfolio that neither Pearson’s correlation nor your favorite linear i.i.d. factor model is accounting for, and that will come bite you hard in big market moves. Any portfolio of yours in these assets that you think is market-neutral is probably not market-neutral at all!
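The three cases above can be sketched as a small helper. The function name `black_swan_check`, the returned labels and the tolerance `tol` are all hypothetical: the post does not say how close "≈" is or how large ">>" must be, so the threshold is an assumption to be tuned. For simplicity the sketch compares magnitudes in all three cases.

```python
def black_swan_check(corr, acorr, tol=0.05):
    """Classify a pair of assets from its Pearson's correlation (corr) and
    its information-adjusted correlation (acorr).

    tol is an illustrative tolerance, not a value taken from the post.
    """
    gap = abs(acorr) - abs(corr)
    if gap < -tol:
        # Case 2: |ACorr| < |Corr| should not happen; suspect the estimation code.
        return "bug"
    if gap <= tol:
        # Case 1: the two measures agree within tolerance.
        return "consistent"
    # Case 3: association that Pearson's correlation does not see.
    return "red flag"

print(black_swan_check(corr=0.50, acorr=0.52))
print(black_swan_check(corr=0.00, acorr=0.30))
print(black_swan_check(corr=0.60, acorr=0.30))
```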
Does It All Matter? You Bet It Does!
As previously discussed, if we plot information-adjusted correlation against Pearson’s correlation for some pairs of assets, any significant deviation from the line y=x is a strong indication that the i.i.d. Gaussian assumption for asset returns doesn’t hold.
Well, let’s do just that. Let’s consider as our universe of assets the constituents of the S&P 100 and 60 of the most liquid U.S. futures (front-month continuously adjusted with the backward ratio method). For each pair of assets in the universe, we compute both Pearson’s correlation and the information-adjusted correlation between their daily returns, and we plot one against the other in a scatter plot. We get the following chart.
Let’s analyze the chart.
Observation 1: We see that the closer Pearson’s correlation is to 1 (resp. -1), the closer information-adjusted correlation is to 1 (resp. -1). This makes intuitive sense. Pearson’s correlation captures linear associations between returns corresponding to the same time. Strong evidence of this specific form of association does imply strong evidence of association between the underlying time series of returns, which is what information-adjusted correlation captures.
Observation 2: However, we see that information-adjusted correlation does not go to 0 as Pearson’s correlation goes to 0. Intuitively, the lack of evidence of linear association between daily returns corresponding to the same time (i.e. weak Pearson’s correlation) does not in general imply evidence of the lack of association between the two underlying time series of daily returns. This is true in the special case of jointly Gaussian white noise, but certainly not in general. In general, there could be other forms of association (e.g. nonlinear associations, temporal dependencies etc.) that would be captured by information-adjusted correlation but not Pearson’s correlation.
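A synthetic sketch of the temporal-dependency case, using simulated data rather than the market data behind the chart: series `b` simply replays series `a` with a one-day delay, so one series is fully determined by the other, yet Pearson’s correlation between same-day values is close to zero. The variable names and the simulation setup are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5001)

a = x[1:]    # a_t = x_t
b = x[:-1]   # b_t = x_{t-1}, i.e. b replays a with a one-day delay

# Pearson's correlation between values observed at the same time.
same_time_corr = np.corrcoef(a, b)[0, 1]

# Pearson's correlation once b is aligned with the value of a it replays.
lagged_corr = np.corrcoef(a[:-1], b[1:])[0, 1]

print(f"same-time correlation: {same_time_corr:.3f}")
print(f"one-day-lag correlation: {lagged_corr:.3f}")
```

A measure that only looks at same-time pairs treats these two series as unrelated, while a measure defined on the time series themselves would not.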
The fact that the scatter plot above deviates significantly from the y=x line is sufficient empirical evidence that the i.i.d. Gaussian assumption does not hold for daily returns of U.S. stocks and futures!
Main Observation: You see those pairs with 0 Pearson’s correlation on the vertical axis? None of them have 0 information-adjusted correlation! A Pearson’s correlation of 0 between liquid exchange-traded U.S. assets can hide up to a 0.3 ‘real’ correlation, which can only arise through nonlinearities (i.e. fat tails) or temporal dependencies (i.e. butterfly effects), both of which can be a source of black-swan events.
Basically, linear i.i.d. factor models do not accurately capture risk in liquid U.S. stocks and futures!
Information-Adjusted Portfolio Beta
As discussed in Part I, the (CAPM’s) beta of a portfolio can be obtained as
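In its standard form, with notation assumed here (r_p and r_m for the returns of the portfolio and of the market, σ_p and σ_m for their standard deviations), the CAPM beta is the correlation scaled by the ratio of volatilities:

```latex
\beta
  = \mathrm{Corr}(r_p, r_m) \, \frac{\sigma_p}{\sigma_m}
  = \frac{\mathrm{Cov}(r_p, r_m)}{\mathrm{Var}(r_m)}.
```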
A simple generalization of this measure to capture both nonlinear and temporal dependencies between the portfolio’s returns and those of the market is obtained by replacing Pearson’s correlation with information-adjusted correlation. We call the resulting measure information-adjusted portfolio beta.
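A minimal sketch of the substitution, on simulated returns: the CAPM beta is computed as Pearson’s correlation times the ratio of standard deviations, and the information-adjusted beta swaps in an information-adjusted correlation. The function names are hypothetical, and `hypothetical_acorr` is a placeholder value, since estimating information-adjusted correlation from data is covered in the Yellow Paper and not reproduced here.

```python
import numpy as np

def capm_beta(portfolio_returns, market_returns):
    """CAPM beta as Pearson's correlation times the ratio of volatilities."""
    corr = np.corrcoef(portfolio_returns, market_returns)[0, 1]
    return corr * np.std(portfolio_returns, ddof=1) / np.std(market_returns, ddof=1)

def information_adjusted_beta(acorr, portfolio_returns, market_returns):
    """Same formula with Pearson's correlation replaced by an externally
    estimated information-adjusted correlation (acorr)."""
    return acorr * np.std(portfolio_returns, ddof=1) / np.std(market_returns, ddof=1)

# Simulated daily returns (illustrative assumption, not market data).
rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.01, size=2000)
portfolio = 0.8 * market + rng.normal(0.0, 0.005, size=2000)

beta = capm_beta(portfolio, market)

hypothetical_acorr = 0.9  # placeholder; would come from a model-free estimator
a_beta = information_adjusted_beta(hypothetical_acorr, portfolio, market)

print(f"CAPM beta: {beta:.3f}, information-adjusted beta: {a_beta:.3f}")
```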
A direct consequence of the discussion above is that a portfolio with 0 information-adjusted beta has a time series of returns that is independent of the market’s, and is therefore truly market-neutral.
Final Words
In our Yellow Paper we introduce an information-theoretic alternative to Pearson’s correlation, namely the information-adjusted correlation, that fully captures nonlinearities and temporal dependencies between time series of returns in a model-free fashion.
We use the information-adjusted correlation to construct an alternative to the CAPM’s beta of a portfolio, namely the information-adjusted beta, that captures any association between a portfolio and the market (linear and nonlinear, at the same time, or across time).
We illustrate that the i.i.d. Gaussian assumption for asset returns is inconsistent with empirical evidence in U.S. stocks and futures, which attests to the practical importance of the information-adjusted alternatives proposed.
Crucially, we illustrate that Pearson’s correlation, CAPM’s beta, and other i.i.d. linear factor models can hide a significant amount of financial risk that will reveal itself as black-swan events.