Bitcoin’s power-law corridor debunked

Tim Stolte
Amdax Asset Management
Sep 2, 2022


Three years ago, in 2019, long-term Bitcoin price models popped up one after another. They all attempted to forecast Bitcoin prices a few years into the future. Some failed, but some happened to align with the price dynamics that followed. The most famous one remains the Stock-to-Flow model by PlanB. It received a lot of criticism over time, and rightly so. Check out my previous article to find out what exactly is wrong with it.

Another well-known prediction method was put forward a few months later by Harold Christopher Burger. He proposed a so-called power-law corridor growth model, which simply documents a relationship between the relative change in time and the relative change in the Bitcoin price. The model hasn’t been criticised quite as much as Stock-to-Flow, but that doesn’t make it any less flawed.

In this piece, I’ll take a deep dive into the power-law model and attempt to explain why it’s a highly inadequate method for predicting the Bitcoin price. My arguments include both logical and statistical reasoning, but require no prior knowledge of the matter.

Time does not pass exponentially

Interpreting the logic behind a model can be hard, but is absolutely crucial in any case. For a relevant example, I refer to my recent article on PlanB’s “quant investing strategy”. In the power-law model, Harold Christopher Burger starts by plotting the Bitcoin price series (on the y-axis) over time (on the x-axis). After observing a highly non-linear relationship between the two, he continues by logarithmically scaling both axes. This eventually results in a seemingly linear relationship between the logarithm of price (log-price) and the logarithm of time (log-time):

Logarithmically scaling time is possibly the weirdest thing I have ever seen in time series analysis. The whole point of log-scaling is to ensure that change is interpreted in relative terms.

Let’s consider the price of an imaginary asset. If the price is not scaled, a change from 2 to 3 is considered equal to a change from 1 to 2, since both involve one unit of change. But we all know that this yields a distorted picture, because the change from 2 to 3 is much smaller in relative terms (a 50% versus a 100% increase). We therefore logarithmically scale the variable, so that the change from 2 to 3 counts for roughly half the change from 1 to 2.
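The asymmetry is easy to verify numerically. A quick sketch, using the numbers from the example above:

```python
import numpy as np

# On a log scale, equal ratios map to equal distances: going from 1 to 2
# is a 100% increase, while going from 2 to 3 is only a 50% increase.
step_1_to_2 = np.log(2) - np.log(1)   # = log(2/1), about 0.693
step_2_to_3 = np.log(3) - np.log(2)   # = log(3/2), about 0.405

print(round(step_1_to_2, 3), round(step_2_to_3, 3))
```

On the log scale, the 1-to-2 step is larger than the 2-to-3 step, exactly because it is larger in relative terms.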

A scaled time axis, on the other hand, makes no sense at all, because we want the change from 2022 to 2023 to be exactly equal to the change from 2011 to 2012. The amount of time in a year is constant. If we log-scale time, we are effectively modelling real-world time as passing increasingly faster. That is ridiculous.

So what actually happens if we apply this transformation? It has numerous consequences, but I’ll name three here.

Robustness

Sample periods matter a lot for this particular method. Burger applies log-scaling to make the plot resemble a linear relationship. But watch what happens when I move the starting date to 2014.

Anyone can see that this relationship is miles away from a linear one. And that comes from simply moving the starting date a little into the future. We can thus expose the model’s fragility by playing around with a single parameter. In statistics, good models are only mildly affected by such choices of underlying assumptions, a property called robustness. This power-law growth model clearly doesn’t belong to that group.

Logarithmic dates

It’s impossible to actually apply a logarithm to a date; logarithms only work on positive numbers. So in order to create this plot, I converted the dates to positive integers: 1 on the first date, 2 on the second, and so on until the last date. But it appears there’s a problem with that approach.

In the figure below, I included log-log plots with different starting integers. Note that the straightforward approach (starting with 1) produces nothing like a linear relationship. The same holds for very large starting integers like 1000, which make the plot look like an ordinary log-plot. Burger instead uses an absurdly arbitrary starting integer equal to the number of days between 1 January 2009 and the first observation, which is 577 in my case (563 in the original model). There’s no logic or wisdom there, just pure guesswork and picking whatever looks nice…
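The sensitivity to the starting integer is easy to reproduce. A minimal sketch, using a synthetic series constructed to follow the power law exactly (with the article’s slope of 5.88 and the original offset of 563; the 3000-day sample length is a hypothetical choice), so that any departure from linearity below comes purely from picking a different offset:

```python
import numpy as np

n = 3000                                 # hypothetical number of daily observations
d = np.arange(1, n + 1)                  # naive day index starting at 1
log_price = 5.88 * np.log10(d + 562)     # exact power law in days-since-2009

def loglog_r2(offset):
    """R-squared of a straight-line fit of log-price on log-time,
    where the first day is assigned the integer `offset`."""
    x = np.log10(d + offset - 1)
    slope, intercept = np.polyfit(x, log_price, 1)
    resid = log_price - (slope * x + intercept)
    return 1 - resid.var() / log_price.var()

for offset in (1, 563, 1000):
    print(offset, round(loglog_r2(offset), 4))
```

Only the offset of 563 produces an exact straight line; any other starting integer bends the log-log plot, which is precisely why the choice has to be tuned by eye.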

Linear approximation

He identifies the linear approximation of the log-log plot as follows:

log(price) = a + b · log(d),

where d denotes the number of days since 1 January 2009. Estimating b yields 5.88 in my case. We can interpret this number as follows:

As time progresses by 1%, the price increases by 5.88% on average.

Again, note the ambiguity of this interpretation. How can time move 1%? This time movement depends entirely on the arbitrary starting date of 1 January 2009. Moreover, the further we move away from 2009, the more days fit into that 1% time interval. So we cannot convert that 5.88% into an average daily return or anything similar, which leaves us with basically no understanding of the estimation results.
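A toy calculation makes the problem concrete. The number of calendar days inside a 1% “time” step grows with d, while the implied average daily return shrinks (using b = 5.88 from above; the relation d(log price) ≈ (b/d) · d(days) follows from differentiating the fitted line):

```python
b = 5.88  # the fitted elasticity from above

for d in (563, 1000, 2000, 4000):
    days_per_pct = 0.01 * d            # calendar days inside a 1% "time" step
    daily_return_pct = 100 * b / d     # implied average daily return, b/d per day
    print(d, days_per_pct, round(daily_return_pct, 2))
```

A “1% move in time” covers about 6 days in 2010 but about 40 days a decade later, so the 5.88% elasticity never translates into any fixed per-day growth rate.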

Stop using regressions

My second argument concerns the methodology of the analysis. He bases his prediction on the linear approximation, which he obtains by means of a linear regression. I am strongly opposed to this particular method for this particular purpose. If you want to make price predictions, don’t use regressions; just calculate some growth rate and extrapolate it into the future. Please consult my previous article on this particular topic.

Why does he still use a regression for his predictions? One reason is that it sounds sophisticated even though it’s very easy to implement. But another reason is that it lets him report an R-squared. Generally speaking, the R-squared is a measure ranging from 0 to 1 that indicates how well a model explains the actual data (often referred to as the model fit). Or, in this case, how well the log-log plot is described by a linear relationship. But that is the general case, and this particular case happens to fall short of it…

Consider the following log-log plot together with the model’s R-squared from 2014 onwards, similar to the one presented in his article. The further we move along the time axis, the higher the R-squared, which also sits very close to 1, its maximum value. His conclusion is that the model performs exceptionally well, and even increasingly better over time. As it turns out, this is a false conclusion.

In statistics, there is a well-known case in which the R-squared of a linear regression approaches its maximum value of 1 as we add more and more observations, regardless of how good the model fit is. This happens if both variables (log-price and log-time) are 1) non-stationary and 2) not cointegrated.¹ Let’s go ahead and walk through both of these conditions.
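The classic demonstration of this effect (due to Granger and Newbold) is easy to simulate. The sketch below is not the author’s data: it regresses pairs of independent random walks (non-stationary, not cointegrated, and unrelated by construction) against each other, and compares the typical R-squared with that of independent white-noise pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 500

def r_squared(y, x):
    # squared correlation = R-squared of a simple linear regression
    return np.corrcoef(x, y)[0, 1] ** 2

rw_r2, wn_r2 = [], []
for _ in range(reps):
    x_rw, y_rw = rng.normal(size=(2, n)).cumsum(axis=1)   # independent random walks
    x_wn, y_wn = rng.normal(size=(2, n))                  # independent white noise
    rw_r2.append(r_squared(y_rw, x_rw))
    wn_r2.append(r_squared(y_wn, x_wn))

print(round(np.mean(rw_r2), 3), round(np.mean(wn_r2), 3))
```

Despite zero true relationship in both cases, the random-walk regressions routinely produce a sizeable R-squared, while the stationary pairs correctly hover near zero. A high R-squared between non-stationary series is therefore no evidence of a genuine relationship.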

1) Non-stationarity

A variable is non-stationary if its statistical properties (like mean and variance) are not constant over time. Non-stationarity holds true for any ever-increasing function, so log-time is non-stationary by construction. For the log-price, things are a bit harder, so we turn to the Augmented Dickey-Fuller test, a statistical test of the null hypothesis that a variable is non-stationary (i.e. contains a unit root). I run the test over a multitude of sample periods: I fix the starting date at 1 August 2010 and vary the ending date from 1 January 2019 until 1 August 2022 in weekly steps. The relevant results are displayed in the figure below.
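To show the mechanics, here is a bare-bones Dickey-Fuller sketch in plain numpy on simulated data (my analysis above used the full augmented version; the regression below omits the lag augmentation for brevity). It regresses the differenced series on its own lagged level and compares the t-statistic to the approximate 5% critical value of −2.86:

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic of rho in the regression diff(y)_t = c + rho * y_{t-1} + e_t
    (the simplest Dickey-Fuller form, without lag augmentation)."""
    dy, y_lag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(y_lag), y_lag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(1)
random_walk = rng.normal(size=2000).cumsum()   # non-stationary by construction
white_noise = rng.normal(size=2000)            # stationary by construction

print(round(dickey_fuller_t(random_walk), 2), round(dickey_fuller_t(white_noise), 2))
# reject non-stationarity (5% level, with a constant) when t < about -2.86
```

The stationary series produces a hugely negative t-statistic (clear rejection), while the random walk typically does not, which is exactly the pattern the p-values in the figure capture.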

The grey line is the 5% significance level, a commonly used critical level. If the test’s p-value falls below this line, we reject non-stationarity and conclude that the variable is stationary. Note that the p-value dips below the line from May 2022 onwards but sits well above it for most of the sample. Hence, we are unable to firmly reject non-stationarity, and we conclude that there are clear signs of non-stationarity in the log-price.

2) Cointegration

Cointegration is an abstract term that refers to two non-stationary variables moving around each other, in the sense that some combination of them is stationary. It is intuitively explained by the drunk-and-her-dog example in this paper². To test for the presence of cointegration, we turn to the Engle-Granger cointegration test, which works in a similar way to the previous exercise. So we once again extract the relevant p-values from the test over multiple sample periods, shown below. These paint a very clear picture: the p-values sit well above the 5% significance level, so we do not reject the hypothesis that there is no cointegration between the two variables. In other words, there is no clear sign of cointegration.
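The two-step idea behind the Engle-Granger test is simple enough to sketch by hand on simulated data (my analysis above used a packaged implementation; this stripped-down version uses a Dickey-Fuller regression without lag augmentation): first regress one series on the other, then test whether the residuals are stationary, using the more negative Engle-Granger critical value of roughly −3.37 at 5%:

```python
import numpy as np

def df_t(u):
    """Dickey-Fuller t-statistic (no constant) for a residual series."""
    du, u_lag = np.diff(u), u[:-1]
    rho = (u_lag @ du) / (u_lag @ u_lag)
    resid = du - rho * u_lag
    se = np.sqrt(resid @ resid / (len(du) - 1) / (u_lag @ u_lag))
    return rho / se

def engle_granger_t(y, x):
    """Step 1: OLS of y on x (with constant). Step 2: DF test on the residuals."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return df_t(y - X @ beta)

rng = np.random.default_rng(2)
x = rng.normal(size=2000).cumsum()          # a random walk
y_coint = 2 * x + rng.normal(size=2000)     # cointegrated with x by construction
y_indep = rng.normal(size=2000).cumsum()    # an independent random walk

print(round(engle_granger_t(y_coint, x), 1), round(engle_granger_t(y_indep, x), 1))
# compare to roughly -3.37 (5% Engle-Granger critical value, one regressor)
```

The cointegrated pair produces a deeply negative statistic (stationary residuals), while the independent random walks do not, which mirrors the p-values reported above for log-price and log-time.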

Since both conditions hold, we are dealing with a so-called spurious regression. This spuriousness means that the linear regression does not yield valid results, so the increasing R-squared reported in the article says nothing about the fit of the model.

Conclusion

In this article, we have seen that the power-law model is logically and statistically invalid. Logarithmically scaling time is irrational and has severe implications for the model as a whole. Furthermore, statistical theory renders the results of the linear regression useless.

Even though I disagree with his approach, I understand his goal. He seeks to adjust for the diminishing returns in the Bitcoin price series. After this adjustment, the price plot more or less resembles a linear relationship. But making something look nice and familiar doesn’t mean that it’s useful or legit. Especially not when the transformations are built upon ambiguous and arbitrary assumptions. We might as well just draw price predictions by hand while we’re at it.

References

[1] Phillips, P. C. (1986). Understanding spurious regressions in econometrics. Journal of Econometrics, 33(3), 311–340.

[2] Murray, M. P. (1994). A drunk and her dog: an illustration of cointegration and error correction. The American Statistician, 48(1), 37–39.


Quantitative Researcher at Amdax. Master’s degree in Econometrics / Quantitative Finance.