Backtesting Bitcoin’s Stock to Flow Model

Rob Wolfram
9 min read · Dec 28, 2019

bc1qhamal03e2znu7rq32cz9d9duq4lvhzhczrz4n6
----BEGIN BITCOIN SIGNED MESSAGE-----
Backtesting Bitcoin's Stock to Flow Model
-----BEGIN SIGNATURE-----
Hxdkh317fjowIwXYM4HlbwZDxXMofPPc8NiD+PYwwaR8TJP
2uTQiomSPSiBRMvxthKxDNhe zONxJdjGpp3ECjRc=
-----END BITCOIN SIGNED MESSAGE-----

About the author

I am Rob Wolfram. I am a Linux sysadmin in the Netherlands with a strong interest in security and cryptography, and because of that I am a fan and promoter of Bitcoin. I consider myself a Bitcoin maximalist, by which I mean that I think that in the long run other crypto-currencies will only play a marginal role.

One caveat: I am not a statistician and not a trader. This post is written on a personal basis and I cannot claim that my conclusions or even my thought processes have any merit. This article is not financial advice!

Introduction

In his seminal book “The Bitcoin Standard”, the economist Saifedean Ammous explained the stock to flow ratio as a means to quantify the hardness of an asset and applied this to Bitcoin. In March 2019 the pseudonymous Twitter and Medium user “PlanB” shook the Bitcoin community to its core by creating a model and showing that a linear relation exists between the logarithm of Bitcoin’s stock to flow ratio and the logarithm of its market capitalization, thus showing a power law growth of Bitcoin’s value. Two other statisticians scrutinized the model (“Nick” here and Marcel Burger here) and concluded that the two parameters are not only correlated but actually co-integrated.

The model inspired multiple people, including myself, to create web sites with auto-updating charts based on it. There is however one catch with such charts (at least there is with mine): the linear regression on the two parameters is recalculated every time, so an extrapolation into the future made today is likely to differ from the one made next month. That inspired me to test the model with older values. If someone had applied the model in the past, would they have arrived at a prediction that is in the ballpark of today’s value?
TL;DR: I think they would.

Methodology

I used the publicly available data from CoinMetrics and calculated an OLS regression of the logarithm of Bitcoin’s average exchange price in US dollars on the logarithm of the stock to flow ratio for various subsets of the data in the past. For each regression I then compared its prediction for a recent subset of the data (which did not overlap with any of the subsets used to calculate the regression) to the real data of that subset. I ignored data from before July 18, 2010 because no price information was available before then.
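A minimal sketch of the regression step, not the actual code behind the graphs: an ordinary least squares fit of ln(price) on ln(stock to flow) in plain Python. The function name `fit_loglog` and the sample numbers are made up for illustration; the sample follows an exact power law so the fit is easy to check.

```python
import math

def fit_loglog(sf_ratios, prices):
    """OLS fit of ln(price) = intercept + slope * ln(stock to flow)."""
    xs = [math.log(sf) for sf in sf_ratios]
    ys = [math.log(p) for p in prices]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic (not real) data following price = 0.5 * sf ** 3
sf_ratios = [1.5, 3.0, 8.0, 25.0]
prices = [0.5 * sf ** 3 for sf in sf_ratios]
slope, intercept = fit_loglog(sf_ratios, prices)
```

On this synthetic input the slope comes out as the exponent of the power law and the intercept as the log of its constant factor, which is what a straight line in a log/log chart means.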

Both in my calculations and in the resulting graphs and table I use the natural log (log base e), but any other log base would work equally well since a logarithm in another base differs from the natural log only by a constant factor.
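To illustrate why the base does not matter, here is a small check with hypothetical regression coefficients (the values of `a` and `b` are invented, not taken from any fit). Switching from ln to log10 keeps the slope and divides the intercept by ln(10); both lines describe the same price.

```python
import math

a, b = -1.8, 3.3          # hypothetical intercept and slope in natural logs
sf = 25.0                 # hypothetical stock to flow ratio

ln_price = a + b * math.log(sf)
log10_price = a / math.log(10) + b * math.log10(sf)   # same slope, rescaled intercept

price_from_ln = math.exp(ln_price)
price_from_log10 = 10 ** log10_price
```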

Approximately every 4 years (210,000 blocks) there is a “halving event” in Bitcoin mining. That means that the reward in new bitcoins for finding a new block is reduced by half. This has happened twice before in Bitcoin’s history. Since such an event causes a steep increase of the stock to flow ratio, I took previous halving events into account when choosing the subsets from which I calculated the regression. In the results section I refer to these as the “reference set”.
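The jump at a halving can be sketched from the issuance schedule alone. This is an idealized calculation, assuming exactly 144 blocks per day and no lost coins, so the numbers will not match the CoinMetrics data; the function names are made up for illustration.

```python
BLOCKS_PER_HALVING = 210_000
BLOCKS_PER_DAY = 144          # one block per 10 minutes on average
INITIAL_SUBSIDY = 50.0        # BTC per block at launch

def block_subsidy(height):
    """New bitcoins created by the block at this height."""
    return INITIAL_SUBSIDY / 2 ** (height // BLOCKS_PER_HALVING)

def theoretical_stock_to_flow(height):
    """Coins issued before this block, divided by one year of issuance at the current subsidy."""
    era, offset = divmod(height, BLOCKS_PER_HALVING)
    stock = sum(BLOCKS_PER_HALVING * INITIAL_SUBSIDY / 2 ** e for e in range(era))
    stock += offset * block_subsidy(height)
    flow_per_year = block_subsidy(height) * BLOCKS_PER_DAY * 365
    return stock / flow_per_year

before = theoretical_stock_to_flow(209_999)   # last block before the first halving
after = theoretical_stock_to_flow(210_000)    # first block after it
```

Because the stock barely changes from one block to the next while the flow is cut in half, the ratio roughly doubles at the halving block.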

The periods that I used to calculate the regression on are:
• From the start (2010-07-18) to a year after the second halving event (2017-07-08)
• From the start (2010-07-18) to (the day before) the second halving event (2016-07-08)
• From the start (2010-07-18) to (the day before) the first halving event (2012-11-27)
• Between the two halving events (2012-11-29 to 2016-07-08)
• Periods of two years around the halving events (2011-11-29 to 2013-11-27 and 2015-07-10 to 2017-07-08)
• Before the second halving, excluding the 2 year periods around the halving events (2010-07-18 to 2011-11-27 and 2013-11-28 to 2015-07-08).
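The periods above can be written down as date ranges and used to filter daily rows. This is only a sketch: the dictionary keys are made-up labels, `select` is a hypothetical helper, and the test window is the one described in the next paragraph.

```python
from datetime import date

# (first day, last day) pairs, both inclusive, copied from the list of periods.
REFERENCE_SETS = {
    "near all time": [(date(2010, 7, 18), date(2017, 7, 8))],
    "first two halving periods": [(date(2010, 7, 18), date(2016, 7, 8))],
    "first halving period": [(date(2010, 7, 18), date(2012, 11, 27))],
    "second halving period": [(date(2012, 11, 29), date(2016, 7, 8))],
    "around halvings": [(date(2011, 11, 29), date(2013, 11, 27)),
                        (date(2015, 7, 10), date(2017, 7, 8))],
    "not around halvings": [(date(2010, 7, 18), date(2011, 11, 27)),
                            (date(2013, 11, 28), date(2015, 7, 8))],
}
TEST_SET = (date(2017, 7, 9), date(2019, 11, 30))

def in_ranges(day, ranges):
    """True if day falls inside any of the inclusive (start, end) ranges."""
    return any(start <= day <= end for start, end in ranges)

def select(rows, ranges):
    """Keep the (day, value) rows whose day falls in one of the ranges."""
    return [row for row in rows if in_ranges(row[0], ranges)]
```

Every reference range ends before the first day of the test window, which is what keeps the prediction out of sample.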

The data on which I test the prediction runs from one year after the second halving (2017-07-09) to a recent cut-off date (2019-11-30). The cut-off date is fixed so that redoing the experiment yields the same numbers; it serves no other purpose. I simply compare the average of the logarithm of the real price with the average of the logarithm of the predicted price for each of the regression lines. I compare logarithmic values of the price since the regression is done with those. In the results section I refer to this data as the “test set”. The price is in US dollars and the earliest prices are sub-dollar values, so the logarithm is negative for those times. To get a reasonable idea of the error, I express each error relative to the growth in log price, that is, the difference between the log of the first available price in the subset used for the regression and the average log value in the test set.
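One way to read this error measure, as a sketch under my interpretation of the description (the exact formula in the original code may differ): take the difference between the mean log predicted price and the mean log actual price, and divide it by the log growth since the first price of the reference set. The function name and the prices below are invented.

```python
import math

def relative_growth_error(predicted_prices, actual_prices, first_price):
    """Prediction error as a fraction of the log growth since the first reference price."""
    mean_log_pred = sum(math.log(p) for p in predicted_prices) / len(predicted_prices)
    mean_log_actual = sum(math.log(p) for p in actual_prices) / len(actual_prices)
    growth = mean_log_actual - math.log(first_price)
    return (mean_log_pred - mean_log_actual) / growth

# Made-up numbers: every prediction is exactly twice the actual price.
actual = [1000.0, 10000.0]
predicted = [2000.0, 20000.0]
error = relative_growth_error(predicted, actual, first_price=0.1)
```

A positive value means the prediction overshoots the growth, a negative value means it undershoots. Since numerator and denominator are both differences of logs, the result does not depend on the log base.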

For each period, the stock to flow ratio after the last date of the period is extrapolated to the cut-off date based on 144 new blocks per day. The rationale is that someone doing the test on that day in the past would not know the real future stock to flow ratios either, and would have to extrapolate into the future as well.
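A sketch of such an extrapolation, assuming the flow is taken as one day of issuance times 365 (how the flow is actually computed is not stated here, so that is an assumption). The starting stock and block height are hypothetical values chosen so that the projection crosses the second halving at block 420,000.

```python
BLOCKS_PER_DAY = 144
BLOCKS_PER_HALVING = 210_000

def extrapolate_stock_to_flow(stock, height, days):
    """Project daily stock to flow ratios forward from a known stock and block height."""
    ratios = []
    for _day in range(days):
        minted_today = 0.0
        for _block in range(BLOCKS_PER_DAY):
            # Subsidy starts at 50 BTC and halves every 210,000 blocks.
            minted_today += 50.0 / 2 ** (height // BLOCKS_PER_HALVING)
            height += 1
        stock += minted_today
        ratios.append(stock / (minted_today * 365))   # flow = a year at today's rate
    return ratios

# Hypothetical starting point: 1,000 blocks before the second halving.
ratios = extrapolate_stock_to_flow(stock=15_725_000.0, height=419_000, days=10)
```

In this run the ratio creeps up slowly for the first days and then roughly doubles once the halving block has passed.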

Better charts are often obtained when 1 million coins are subtracted from the stock. The rationale is that Satoshi’s coins are estimated at around a million and should be considered zombie coins. I opted not to subtract them because in the past, especially during the first halving period, they were not known to be zombie coins. It seems that CoinMetrics does remove the known lost coins (i.e., OP_RETURN transactions with a value and unclaimed coinbase values) from the stock, so I don’t have to do that myself.

In the resulting graphs I included the regression line and price prediction based on the complete data set, including one standard error above and below that. That is denoted by a green band.
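A sketch of how such a band can be computed, assuming that “one standard error” means the residual standard error of the regression (that reading is an assumption, as are the function names and the toy data).

```python
import math

def residual_standard_error(xs, ys, slope, intercept):
    """Standard error of a fitted line: sqrt(sum of squared residuals / (n - 2))."""
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))

def band(x, slope, intercept, se):
    """Lower bound, center and upper bound of the band at log stock to flow x."""
    center = intercept + slope * x
    return center - se, center, center + se

# Toy data whose OLS line is slope 1, intercept 0.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 1.9, 3.1]
se = residual_standard_error(xs, ys, slope=1.0, intercept=0.0)
low, mid, high = band(2.0, 1.0, 0.0, se)
```

Because the regression is done on log prices, a band of one standard error above and below the line corresponds to multiplying and dividing the predicted price by e raised to that standard error.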

Results

Near all time (Jul 2010 to Jul 2017)

This emulates the graph that I update daily as if that graph was created in July 2017. As expected, the result does not deviate a lot from the average value in the tested period since it immediately precedes it (the expected growth is 2.3% higher than in reality). The test set falls in a period between two halvings where the stock to flow ratio does not vary a lot. Volatility during that period is not primarily caused by the slowly increasing stock to flow ratio.

First two halving periods (Jul 2010 to Jul 2016)

This data fits pretty well and overshoots the growth to the average of the test set by 7.2%. Even though that results in a predicted price that is more than double the average real price in the tested time period, a doubling seems to me to be well within scope with a growth of nearly five orders of magnitude.
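The step from a 7.2% overshoot in log growth to “more than double” in price can be checked with a short calculation. It assumes a growth of exactly five orders of magnitude as a stand-in for “nearly five”, so the result is an upper estimate; the function name is made up.

```python
import math

def price_factor(relative_error, orders_of_magnitude):
    """Factor by which the predicted price is off, given the error relative to log growth."""
    log_growth = orders_of_magnitude * math.log(10)
    return math.exp(relative_error * log_growth)

factor = price_factor(0.072, 5)   # about 2.3
```

A relative error that looks small in log terms becomes a sizeable factor in price terms, because the error is multiplied by the full log growth before being exponentiated.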

First halving period only (Jul 2010 to Nov 2012)

As can be expected, this results in a prediction that fits less well than the previous one, but overshooting the growth by 12.3% (a prediction of four times the actual value compared to the nearly five orders of magnitude) is better than I expected. Although the fitness of the regression with the reference set is not great (with an R squared value of 61%), the fitness of the resulting line with the complete data set is not bad.
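Scoring a line against data it was not fitted on can be sketched with the usual R squared formula. Whether the numbers in this article were computed exactly this way is an assumption; the function name and the toy data are invented.

```python
def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of a given line against the points (xs, ys)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Toy "complete data set" in log/log space.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 1.9, 3.1]
r2_own_fit = r_squared(xs, ys, slope=1.0, intercept=0.0)      # the OLS line of this data
r2_other_line = r_squared(xs, ys, slope=0.8, intercept=0.0)   # a line fitted elsewhere
```

A line taken from a different subset always scores at or below the data’s own OLS line, and for a sufficiently poor line the score can even become negative.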

Second halving period only (Nov 2012 to Jul 2016)

Even though the predicted price does not deviate too much from the actual price in the tested period (it undershoots the average price by 7.6%), the prediction line does not fit the full data set of the first halving period very well. The predicted price for the start of the full data set is nearly ten times as high as the expected price based on a regression of the full data set. This result surprised me. It seems that the regression line is skewed quite a bit by the two price peaks totaling nearly two orders of magnitude in 2013.

Time around halving events (Nov 2011 to Nov 2013 and Jul 2015 to Jul 2017)

After a halving event the stock to flow ratio increases dramatically and volatility is to be expected during this period. Still, except for the near all time graph, this has the best fit of all tested sets (with an R squared score of 93% when compared to the full data set). It undershoots the average from the test set by just 3.7%. It can be expected to deviate only a little from the test set since the test set directly follows the reference set, but the regression is also completely within one standard error of the regression of all data points. This, too, surprised me.

Time not around halvings (Jul 2010 to Nov 2011 and Nov 2013 to Jul 2015)

Here the fit is not that great, but just like with the regression on only the first halving period, the prediction of the test set average is not completely absurd (overshooting by 13%). The log/log chart here (log of stock to flow vs log of price) shows pretty clearly that the volatility is relatively large during the times that the stock to flow ratio doesn’t increase a lot. Even though this regression has the worst fit on the total data, that “worst fit” still has an R squared score of more than 85%.

Conclusion

I am very aware that I’m skating on thin ice here but I’ll venture an opinion anyway. I think that even the worst fitting prediction line makes a future prediction that is in the ballpark of the actual result, given the very steep rise that bitcoin has had in the past ten years. It seems that the periods when the stock to flow ratio is rising significantly give the best prediction of the future value. I look forward to seeing the effect of the next halving event in May 2020.

Data overview

The following table has some numbers on the difference between predicted and actual values and the fitness quality of the regression.

Source code

The code for generating these graphs is available at my Gitlab page.