Beyond traditional asset return modelling: Embracing thick tails.

Guide to statistical inference for preasymptotics.

Jun 21, 2022

Statistical inference for asset returns can seem deceptively easy at first. However, large negative returns and the volatile nature of financial markets can severely complicate the analysis. Hence, a large body of literature focuses on overcoming the difficulties encountered in modelling asset returns. This article provides guidance for conducting a scientifically rigorous analysis of asset returns.

Primer on returns and their application in finance.

Let us start with the definition of returns as provided by Princeton:

“A return is a percentage defined as the change of the price expressed as a fraction of the initial price.”

The classical investing objective is to receive the highest return for the least amount of risk. Normally, practitioners use simple returns, as defined in Eq. 1, or logarithmic returns, as defined in Eq. 2.

Equation 1: Definition of simple return.

Equation 2: Definition of logarithmic return.

Equation 3 shows that, based on a first-order Taylor approximation of the logarithm, simple and logarithmic returns are approximately equal when returns are small.

Equation 3: Relation between logarithmic and simple returns for small returns.
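For reference, Equations 1 to 3 can be written out as follows. This is a reconstruction from the standard definitions, assuming P_t denotes the closing price at time t, R_t the simple return, and r_t the logarithmic return; the symbols are illustrative and not taken from the original figures.

```latex
% Eq. 1: simple return
R_t = \frac{P_t - P_{t-1}}{P_{t-1}}

% Eq. 2: logarithmic return
r_t = \ln\!\left(\frac{P_t}{P_{t-1}}\right) = \ln(1 + R_t)

% Eq. 3: Taylor expansion of the logarithm around R_t = 0
r_t = \ln(1 + R_t) = R_t - \frac{R_t^2}{2} + \frac{R_t^3}{3} - \dots \approx R_t
\quad \text{for } |R_t| \ll 1
```

The leading error term of the approximation is R_t^2 / 2, which is why the gap between the two return definitions grows quickly once returns become large.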

Figure 1 shows that the approximation works well for returns in the range of -15% to 15%, i.e. the majority of empirical asset returns.

Figure 1: Relation between simple and logarithmic returns. Source: Author.
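The quality of the approximation can be checked numerically. The snippet below is a minimal sketch using only the Python standard library; the function names and the starting price of 100 are illustrative assumptions.

```python
import math


def simple_return(p_prev, p_now):
    """Simple return: price change as a fraction of the initial price."""
    return (p_now - p_prev) / p_prev


def log_return(p_prev, p_now):
    """Logarithmic return: log of the price ratio."""
    return math.log(p_now / p_prev)


# Compare both definitions over the range discussed in the text.
p_prev = 100.0  # hypothetical starting price
for r in (-0.15, -0.05, -0.01, 0.01, 0.05, 0.15):
    p_now = p_prev * (1.0 + r)
    gap = simple_return(p_prev, p_now) - log_return(p_prev, p_now)
    print(f"simple={r:+.2%}  log={log_return(p_prev, p_now):+.4%}  gap={gap:.4%}")
```

The gap is always non-negative and grows roughly with the square of the return, which matches the second-order term of the Taylor expansion.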

A critical reader might ask: why do we not simply analyze prices instead of returns? The advantages of using returns are twofold:

Normalization: Measuring all price increases in a comparable metric, thus enabling evaluation of analytic relationships amongst multiple assets.
Stationarity: Asset prices are (usually) non-stationary which causes statistics, such as mean and variance, to change over time.

The question then still remains whether to use logarithmic or simple returns. Advantages of logarithmic returns are:

Non-existence of negative asset prices: If logarithmic returns are assumed to be normally distributed, prices are log-normally distributed, which theoretically prevents negative prices.
Continuously compounded returns: For non-stochastic processes, such as the returns on risk-free fixed interest securities held to maturity, when logarithmic returns are used, the frequency of compounding does not matter and returns across assets can more easily be compared.
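Time-additivity: The multi-period logarithmic return is the sum of the single-period logarithmic returns. If the single-period logarithmic returns are independent and normally distributed, the compounded return is normal as well, because the sum of independent normally distributed variables is normal. This does not hold for compounded simple returns, which are products rather than sums.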
Numerical stability: Adding small numbers is numerically safe, while multiplying small numbers is not as it is subject to arithmetic underflow.
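The time-additivity property can be illustrated with a small sketch. The price series below is hypothetical; the point is only that logarithmic returns add up across periods while simple returns have to be compounded multiplicatively.

```python
import math

# Hypothetical daily closing prices (illustrative values only).
prices = [100.0, 104.0, 99.0, 101.5, 108.0]

simple = [(b - a) / a for a, b in zip(prices, prices[1:])]
logs = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Return over the whole period, computed directly from first and last price.
total_simple = prices[-1] / prices[0] - 1.0
total_log = math.log(prices[-1] / prices[0])

# Simple returns compound via a product; logarithmic returns via a sum.
compounded = math.prod(1.0 + r for r in simple) - 1.0
summed = sum(logs)

print(f"total simple return : {total_simple:.6f}")
print(f"compounded simple   : {compounded:.6f}")
print(f"naive sum of simple : {sum(simple):.6f}  (does not match)")
print(f"total log return    : {total_log:.6f}")
print(f"sum of log returns  : {summed:.6f}")
```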

The advantages of using simple returns are:

Direct measure of wealth change: Logarithmic returns do not give a direct measure of the change in wealth of an investor over a particular period. By definition, the appropriate measure to use for this purpose is the simple return over that period.
Mean logarithmic return is dependent on simple return variance: Hudson and Gregoriou (2010) show that the mean of a set of returns calculated using logarithmic returns is less than the mean calculated using simple returns by an amount proportional to the variance of the returns.
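The relation between the two means can be checked on simulated data. The sketch below assumes Gaussian simple returns with a hypothetical daily mean of 0.05% and volatility of 1%; these parameters are illustrative and not taken from Hudson and Gregoriou. To second order in the Taylor expansion, the gap between the two means is about half the variance of the simple returns.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical daily simple returns: mean 0.05%, volatility 1% (assumed values).
simple = [random.gauss(0.0005, 0.01) for _ in range(50_000)]
logs = [math.log1p(r) for r in simple]  # log return = ln(1 + simple return)

mean_simple = statistics.fmean(simple)
mean_log = statistics.fmean(logs)
half_var = 0.5 * statistics.pvariance(simple)

print(f"mean simple return : {mean_simple:.6e}")
print(f"mean log return    : {mean_log:.6e}")
print(f"difference         : {mean_simple - mean_log:.6e}")
print(f"half the variance  : {half_var:.6e}")
```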

In this article, we focus on simple daily returns as measured from close to close. Having motivated the use of simple returns, we continue with statistical inference on asset returns and first introduce two related concepts.

Central Limit Theorem & Law of Large Numbers

In probability theory, the Law of Large Numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and tends to become closer to the expected value as more trials are performed.

The LLN is known in a weak and strong form. We introduce the weak LLN but encourage you to research the strong LLN.

Weak Law of Large Numbers

The Weak LLN states the sample average converges in probability towards the expected value. In mathematical notation, the theorem looks like Eq. 5:

We can derive the weak LLN by Chebyshev’s inequality and encourage you to study the derivation.
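As a sketch, the weak LLN and its derivation from Chebyshev’s inequality can be written as follows, assuming X_1, ..., X_n are independent and identically distributed with mean \mu and finite variance \sigma^2 (the finite-variance assumption is needed for this particular proof, not for the theorem itself).

```latex
% Weak LLN: the sample mean converges in probability to the expected value
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i ,
\qquad
\lim_{n \to \infty} P\left( \left| \bar{X}_n - \mu \right| \geq \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0

% Derivation via Chebyshev's inequality
E[\bar{X}_n] = \mu ,
\qquad
\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}

P\left( \left| \bar{X}_n - \mu \right| \geq \varepsilon \right)
\leq \frac{\mathrm{Var}(\bar{X}_n)}{\varepsilon^2}
= \frac{\sigma^2}{n \varepsilon^2}
\xrightarrow[n \to \infty]{} 0
```

The bound makes the role of the variance explicit: the larger \sigma^2 is, the more observations are needed before the sample mean is reliably close to \mu.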

Central Limit Theorem

The standard (Lindeberg-Lévy) version of the Central Limit Theorem (CLT) is as follows: as n approaches infinity, the standardized sum of n independent and identically distributed random variables with finite variance converges in distribution to a standard Gaussian.
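In formula form, and assuming X_1, ..., X_n are independent and identically distributed with mean \mu and finite variance \sigma^2, the Lindeberg-Lévy statement reads:

```latex
\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma \sqrt{n}}
= \frac{\sqrt{n}\,\left( \bar{X}_n - \mu \right)}{\sigma}
\xrightarrow{d} \mathcal{N}(0, 1)
\quad \text{as } n \to \infty
```

The theorem is a statement about the limit only. It says nothing about how large n must be before the Gaussian approximation is accurate, which is the preasymptotic question this article is concerned with.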

Empirical asset return distribution

To perform statistical inference, we need empirical asset returns, which we source for GE, AAPL, and IBM for the trading years 2018 through 2021. Figure 2 shows that the kurtosis is higher than that of a normal distribution, resulting in more observations in the tails of the empirical distribution.

Figure 2: Empirical asset return (upper) and tail (lower) distribution for GE, AAPL, and IBM during 2018 until 2021. Source: Author.

This imperfect fit in the tails is supported by an overwhelming body of empirical evidence that shows that asset returns do not follow a Gaussian distribution [4][5][6][7]. For example, Bouchaud et al. (2018) show that the tail distribution for a multitude of asset classes can be approximated by a power-law with an exponent approximately equal to 3.5.

Figure 3: Empirical return distributions of equities, CDS spreads, and implied volatility of options that can be described by a power-law with exponent equal to 3.5. Source: Trades, Quotes and Prices by Bouchaud et al. (2018).

Thick-tailed asset returns and their statistical consequences

Let us first define a thick-tailed distribution:

A thick-tailed distribution is a distribution that has a higher kurtosis than a Gaussian distribution.

In layman’s terms, this boils down to any distribution with a kurtosis higher than 3: compared with a Gaussian, more than roughly 68% of the observations fall within one standard deviation, and extreme observations are more frequent as well. The statistical consequence is best explained by an example from finance. Suppose you are a trader and have to comment on the likelihood of a return larger than 10 sigma in magnitude. If you believe in Gaussian returns, this event occurs with negligible probability. If you, however, believe in a thick-tailed distribution of returns, this probability is certainly not negligible:

Equation 8: Probability of 10-sigma event for Gaussian (upper) and Student’s t distributed (lower) returns.
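The order of magnitude of both probabilities can be reproduced with a short calculation. The sketch below rests on assumptions that are not confirmed by the original equation: the Student’s t distribution has 3 degrees of freedom (in line with the tail exponent estimated later in the article), "sigma" refers to the standard deviation of the respective distribution, and the event is two-sided. For 3 degrees of freedom the cumulative distribution function has a closed form, so only the standard library is needed.

```python
import math


def gaussian_two_sided_tail(k):
    """P(|X| > k * sigma) for a Gaussian random variable."""
    return math.erfc(k / math.sqrt(2.0))


def student_t3_two_sided_tail(k):
    """P(|T| > k * sigma) for a Student's t with 3 degrees of freedom.

    A t(3) variable has standard deviation sqrt(3), so a k-sigma event is
    |T| > k * sqrt(3). With u = t / sqrt(3) the one-sided tail is
    (atan(1 / u) - u / (1 + u^2)) / pi, and here u equals k.
    """
    u = float(k)
    one_sided = (math.atan(1.0 / u) - u / (1.0 + u * u)) / math.pi
    return 2.0 * one_sided


k = 10
print(f"Gaussian      P(|X| > {k} sigma) = {gaussian_two_sided_tail(k):.3e}")
print(f"Student's t3  P(|T| > {k} sigma) = {student_t3_two_sided_tail(k):.3e}")
```

Under these assumptions the two probabilities differ by roughly nineteen orders of magnitude, which is the point of the example.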

Bouchaud et al. (2018) state that on shorter time scales (between a minute and a few hours), the empirical density function of returns can be fit reasonably well by a Student’s t distribution, whose probability density function is given by:

Equation 7: Probability density function of a Student’s t distribution.
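For reference, the density of a standard Student’s t distribution with \nu degrees of freedom is given below. This is the textbook parameterization; the form used by Bouchaud et al. may include an additional scale parameter, so treat the notation as an assumption.

```latex
f(x; \nu) = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}
                 {\sqrt{\nu \pi}\; \Gamma\!\left(\frac{\nu}{2}\right)}
            \left( 1 + \frac{x^2}{\nu} \right)^{-\frac{\nu + 1}{2}}

% Tail behaviour for large |x|
f(x; \nu) \sim |x|^{-(\nu + 1)} ,
\qquad
P(|X| > x) \sim x^{-\nu}
```

The degrees of freedom \nu therefore play the role of the tail exponent: the survival function decays as a power law with exponent \nu, and moments of order \nu or higher do not exist.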

Let us verify the thick-tailedness of asset returns by fitting a Student’s t distribution to the empirical returns of the S&P 500. Figure 4 shows that the empirical estimate of the tail exponent is approximately 3, which clearly indicates a thick-tailed distribution.

Figure 4: Empirical calibration of Student t distribution on daily S&P500 returns during 2018–2021.

Effect of thick tails on Law of Large Numbers

Let’s see what the implications are for the LLN if the underlying distribution is thick-tailed. For that, we perform a Monte Carlo simulation for a thin-tailed distribution (Gaussian), a thick-tailed distribution (Student’s t), and a heavy-tailed distribution (Cauchy). The Student’s t distribution is calibrated on the daily S&P500 returns. Figure 5 shows that the sample mean converges more slowly to the true mean for a Student’s t distribution than for a Gaussian distribution of returns. For the Cauchy distribution the mean is undefined, so the sample mean does not converge at all.

Figure 5: Monte Carlo simulation of LLN for thin-tailed Gaussian (upper), thick-tailed Student’s t (center), and heavy-tailed Cauchy (lower) distribution.
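A simulation of this kind can be sketched with the standard library alone. The version below is not the code behind Figure 5: it assumes a Student’s t with 3 degrees of freedom and unit scale instead of the S&P500 calibration, and the sample size and seed are arbitrary choices.

```python
import math
import random

NU = 3      # assumed tail exponent, in line with the estimate in Figure 4
N = 20_000  # number of simulated draws per distribution (illustrative)


def student_t(rng, nu):
    """Draw from a Student's t with nu degrees of freedom: Z / sqrt(chi2 / nu)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) = Gamma(nu/2, scale 2)
    return z / math.sqrt(chi2 / nu)


def cauchy(rng):
    """Draw from a standard Cauchy distribution via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def running_mean(draw, n):
    """Return the sample mean after each of n successive draws."""
    total, path = 0.0, []
    for i in range(1, n + 1):
        total += draw()
        path.append(total / i)
    return path


rng = random.Random(42)
paths = {
    "gaussian": running_mean(lambda: rng.gauss(0.0, 1.0), N),
    "student_t": running_mean(lambda: student_t(rng, NU), N),
    "cauchy": running_mean(lambda: cauchy(rng), N),
}

# Print the running mean at a few sample sizes; the true mean is 0 for the
# Gaussian and the Student's t, and undefined for the Cauchy.
for name, path in paths.items():
    checkpoints = ", ".join(f"n={n}: {path[n - 1]:+.4f}" for n in (100, 1_000, N))
    print(f"{name:>10}  {checkpoints}")
```

Plotting the three paths against n gives a picture comparable to Figure 5: the Gaussian path settles quickly, the Student’s t path settles but with occasional jumps, and the Cauchy path keeps jumping regardless of sample size.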

Taleb (2020) illustrates the statistical consequences of thick-tailed distributions in Figure 6 and shows that the confidence interval around the sample mean is wider than for a Gaussian distribution.

Figure 6: Convergence of standard error of sample mean for thin- and thick-tailed distribution. Source: Taleb (2020).

Statistical consequences for inference on asset returns?

Figure 7 shows that the empirical standard error converges more slowly for Student’s t distributed returns than for Gaussian distributed returns. In other words, by applying the CLT in finite samples, we underestimate the standard error. We observe that the standard error for a Student’s t distribution with tail exponent 3 is 1.6 times the standard error for a Gaussian distribution. Put differently, by applying the CLT we underestimate the standard error by 37.5% (1 - 1/1.6 = 0.375).

Figure 7: Convergence of standard error for thin-tailed (Gaussian) and thick-tailed (Student t distribution with tail exponent 3) and its associated difference in standard error. Source: Author.

Analyzing asset returns? Apply an adjustment factor to the standard error!

Deriving standard errors for thick-tailed asset returns requires a thorough understanding of the preasymptotic behavior of the Law of Large Numbers and the Central Limit Theorem. Assuming Gaussianity of the sample mean based on the Central Limit Theorem underestimates the empirical standard error. For returns with a tail exponent of approximately 3, as estimated above, we recommend multiplying the standard error by a factor of 1.6.
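Applying the adjustment is straightforward. The sketch below uses a hypothetical sample of daily returns and hypothetical function names; the factor of 1.6 is the one reported above for a tail exponent of 3 and should not be assumed to carry over to other tail exponents. The 1.96 multiplier for a 95% interval is the usual Gaussian quantile, kept here for simplicity.

```python
import math
import statistics

ADJUSTMENT_FACTOR = 1.6  # from the article, for a tail exponent of about 3


def standard_error(returns):
    """Classical standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(returns) / math.sqrt(len(returns))


def adjusted_standard_error(returns, factor=ADJUSTMENT_FACTOR):
    """Standard error inflated by the thick-tail adjustment factor."""
    return factor * standard_error(returns)


# Hypothetical daily simple returns (illustrative values only).
returns = [0.012, -0.008, 0.003, -0.021, 0.007, 0.015, -0.004, 0.001]

mean = statistics.fmean(returns)
se = standard_error(returns)
adj_se = adjusted_standard_error(returns)

print(f"sample mean             : {mean:+.5f}")
print(f"classical std. error    : {se:.5f}")
print(f"adjusted std. error     : {adj_se:.5f}")
print(f"adjusted 95% interval   : [{mean - 1.96 * adj_se:+.5f}, {mean + 1.96 * adj_se:+.5f}]")
print(f"underestimation w/o adj.: {1.0 - se / adj_se:.1%}")
```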

Additional resources

The following tutorials and lectures were very helpful for my understanding of thick-tailed distributions and their effect on the preasymptotic behavior of the Central Limit Theorem.

Academia

Professor John Tsitsiklis @ MIT.
Professor Nassim Nicholas Taleb @ Cambridge.

References

[1] M. L. de Prado, Advances in financial machine learning (2018), John Wiley & Sons.

[2] N. N. Taleb, Statistical consequences of fat tails: Real world preasymptotics, epistemology, and applications (2020), arXiv preprint arXiv:2001.10488

[3] J. P. Bouchaud, J. Bonart, J. Donier, and M. Gould, Trades, quotes and prices: financial markets under the microscope (2018), Cambridge University Press.

[4] J. P. Bouchaud, Power laws in economics and finance: some ideas from physics (2001), Quantitative Finance, 1, 105–112.

[5] J. P. Sethna, K. A. Dahmen, and C.R. Myers, Crackling noise (2001), Nature, 410(6825), 242–250.

[6] A. Clauset, C. R. Shalizi, and M. E. J. Newman, Power-law distributions in empirical data (2009), SIAM Review, 51(4), 661–703.

[7] X. Gabaix, Power laws in economics and finance (2009), Annual Review of Economics, 1(1), 255–294.

[8] R. Hudson, A. Gregoriou, Calculating and comparing security returns is harder than you think: A comparison between logarithmic and simple returns (2010), International Review of Financial Analysis, Elsevier, 38(C), 151–162.
