All models are wrong, and this one isn’t even useful

Tim Stolte
Amdax Asset Management
Aug 16, 2022

“Quant investing is both art and science”

This is how crypto influencer and Stock-to-Flow creator PlanB introduced his new investing strategy that he generously shared with his followers. Of course, the two-edged definition of quant investing allows him to present himself as an artist whose undeniable skill and wisdom will inspire all crypto investors left, right and center. There’s nothing new there. He has been doing this for years and he’s actually really good at it. It’s the other side that worries me though…

By stating that the quantitative approaches to investment decisions sometimes fall into the subjective category of art, he’s taking a hidden stance against all scientific principles of the subject. It effectively renders all logic and reason irrelevant once the artist himself says so. While I do think that quantitative methods can be creative, elegant and sometimes even mysteriously unexplainable, calling it art is another thing entirely and potentially dangerous. Data should be used intelligently and with care, because the past has proven time and again how easy it is to frame and deceive people with it. And this may very well be one of those times.

In this article, I will discuss what is wrong with PlanB’s article and strategy in a structured and detailed manner. Along the way, I’ll also zoom in on some quantitative and investing principles, so that no prior knowledge of either topic is required. Let’s dive straight in!

Trading rules

In order to maximise profit as an investor, there are two things that you must do: buy when the price is low and sell when the price is high. Easier said than done of course, because we can only identify these perfect opportunities (when the price tops and bottoms) in hindsight. The concept of quant investing provides a possible solution to this problem by aiming to predict tops and bottoms using hidden patterns in data.

PlanB’s approach is very simple: use only the Relative Strength Index (RSI) in the model to pinpoint tops and bottoms in the Bitcoin price and thus find optimal buying and selling opportunities. The RSI is an index that takes a value between 0% and 100% to determine whether an asset is overbought (high RSI; indicating a top) or oversold (low RSI; indicating a bottom). So all we need to do is sell when the RSI is high and buy when the RSI is low. Simple, right? Let’s take a look at the trading rules in the article:

  • Sell if the RSI is below 65% now AND was above 90% in at least one of the last six months.
  • Buy if the RSI is more than 2% higher than the low AND was below 50% in at least one of the last six months.
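To make the two rules concrete, here is a minimal Python sketch. It is not PlanB’s code: the Wilder-style smoothing, the 14-period default and the helper names `rsi`, `sell_signal` and `buy_signal` are my assumptions. So is reading “the last six months” as the six months before the current one, and “2% higher than the low” as two percentage points above the lowest RSI in that window.

```python
from typing import List, Optional


def rsi(prices: List[float], period: int = 14) -> List[Optional[float]]:
    """Wilder-style RSI in percent; None until `period` price changes exist."""
    out: List[Optional[float]] = [None] * len(prices)
    if len(prices) <= period:
        return out
    changes = [b - a for a, b in zip(prices, prices[1:])]
    # Seed the averages with a plain mean over the first `period` changes.
    avg_gain = sum(max(c, 0.0) for c in changes[:period]) / period
    avg_loss = sum(max(-c, 0.0) for c in changes[:period]) / period
    for i in range(period, len(prices)):
        if i > period:
            c = changes[i - 1]  # price change from month i-1 to month i
            avg_gain = (avg_gain * (period - 1) + max(c, 0.0)) / period
            avg_loss = (avg_loss * (period - 1) + max(-c, 0.0)) / period
        total = avg_gain + avg_loss
        # RSI = 100 * avg_gain / (avg_gain + avg_loss); 50 if the price never moved.
        out[i] = 50.0 if total == 0 else 100.0 * avg_gain / total
    return out


def sell_signal(rsi_values: List[Optional[float]], t: int, window: int = 6) -> bool:
    """RSI below 65 now AND above 90 in at least one of the previous `window` months."""
    now = rsi_values[t]
    recent = [r for r in rsi_values[max(0, t - window):t] if r is not None]
    return now is not None and now < 65 and any(r > 90 for r in recent)


def buy_signal(rsi_values: List[Optional[float]], t: int, window: int = 6) -> bool:
    """RSI more than 2 points above the window low AND that low is below 50."""
    now = rsi_values[t]
    recent = [r for r in rsi_values[max(0, t - window):t] if r is not None]
    if now is None or not recent:
        return False
    low = min(recent)
    return low < 50 and now > low + 2


example = [95.0, 60.0, 48.0, 55.0]  # made-up RSI values, not real data
print(sell_signal(example, 3), buy_signal(example, 3))  # True True
```

Note that with the made-up series above, both rules fire in the same month: the RSI has dropped below 65 after a reading above 90, and it has also bounced more than two points off a low below 50.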

If your first reaction is that these trading rules seem a little complicated for such a simple metric, you’re right, but we’ll get to that later on. Instead, we’ll first focus on the implications of the signals. Consider the following graph, with the Bitcoin price in grey and the RSI in blue:

We immediately spot something odd. Buy and sell signals only appear when the RSI is relatively low. This is in sharp contrast with the interpretation of the metric, because we expected to sell for high RSI values which would signal a top. We even find that January 2012 is both a sell AND a buy signal. The strategy interprets it as a buy signal simply because we have already sold and cannot sell any more. In other words, we buy even though the strategy tells us to sell. In this situation, the signal depends on our current positioning rather than our expectation of where the price is heading. It basically flies in the face of the whole idea behind quant investing. But hey, art has no rules…
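The position dependence can be spelled out in a few lines. This is only a sketch of how I read the behaviour, assuming an all-in or all-out position; the function name `resolve_action` is hypothetical.

```python
def resolve_action(holding_btc: bool, sell: bool, buy: bool) -> str:
    """Decide what to do when the sell and buy rules may both fire.

    Assumes the portfolio is either fully in BTC or fully in cash.
    """
    if holding_btc and sell:
        return "sell"
    if not holding_btc and buy:
        return "buy"
    return "hold"


# Same month, same signals, opposite actions depending on the position:
print(resolve_action(holding_btc=True, sell=True, buy=True))   # sell
print(resolve_action(holding_btc=False, sell=True, buy=True))  # buy
```

The market view expressed by the signals is identical in both calls; only the current holding decides the trade.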

Some readers may counter these arguments along the lines of the good old saying: “If it’s stupid and it works, it ain’t stupid”. To those, I propose the following four-step method in true PlanB fashion:

Data

I have BTC monthly price data from Glassnode and obtain year and month data from the Gregorian calendar. I use data from January 2011 until July 2022.

Information

From the BTC monthly price data, it looks like there are tops and bottoms that would perfectly correspond with long-term selling and buying opportunities, respectively.

Knowledge

It seems like those tops and bottoms occur more or less during the same months across multiple years.

Wisdom

We define two very simple trading rules that align with these opportunities:

  • Sell at the end of December if the year ends with a 0, 3 or 7.
  • Buy at the end of January if the year ends with a 1, 5 or 9.

Behold, the remarkable performance of this strategy, as depicted in the chart below. We start with 1 BTC in January 2011 and end up with over 16.4 BTC in July 2022, outperforming buy-and-hold by a factor of more than 16.4. The drawdown of this strategy is -53% in BTC terms. It sharply outperforms PlanB’s strategy (9.7 BTC and -49% drawdown in BTC terms). Even though my strategy’s risk profile is a little more aggressive, the risk-adjusted returns remain higher by a wide margin.
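For readers who want to see how such numbers come about, here is a minimal sketch of a backtest in BTC terms for the calendar rules. It is an illustration, not the backtesting class from the appendix: the function names are hypothetical, trading costs are ignored, the position is all-in or all-out, and the prices in the example are made up.

```python
from typing import Dict, List, Tuple

SELL_DIGITS = {0, 3, 7}  # sell at the end of December in years ending in these digits
BUY_DIGITS = {1, 5, 9}   # buy at the end of January in years ending in these digits


def calendar_backtest(month_end_prices: Dict[Tuple[int, int], float]) -> List[float]:
    """Portfolio value in BTC at each month end, starting from 1 BTC."""
    btc, usd = 1.0, 0.0
    values = []
    for (year, month), price in sorted(month_end_prices.items()):
        if month == 12 and year % 10 in SELL_DIGITS and btc > 0:
            usd, btc = btc * price, 0.0   # sell everything for USD
        elif month == 1 and year % 10 in BUY_DIGITS and usd > 0:
            btc, usd = usd / price, 0.0   # buy back with all USD
        values.append(btc + usd / price)  # cash is valued in BTC at the current price
    return values


def max_drawdown(values: List[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction."""
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1.0)
    return worst


# Made-up month-end prices, keyed by (year, month); not real BTC data.
prices = {(2013, 11): 1000.0, (2013, 12): 800.0, (2014, 1): 1600.0, (2015, 1): 200.0}
values = calendar_backtest(prices)
print(values)                # [1.0, 1.0, 0.5, 4.0]
print(max_drawdown(values))  # -0.5
```

While the strategy sits in cash, a rising price shrinks the portfolio when measured in BTC, which is where the drawdown in BTC terms comes from.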

I sincerely hope that everyone understands that these trading rules don’t make any sense and have no interpretation. Even though the strategy probably beats every conceivable long-term strategy in hindsight, it should not be followed by anyone seeking to outperform a simple buy-and-hold strategy in the future. But interpretation is not the only thing missing from this type of investing approach, bringing us to the next topic.

In-sample versus out-of-sample

The article already mentions the term out-of-sample performance, but I am aware that it doesn’t immediately ring a bell with many. A quick introduction:

Consider two completely separate periods of price data, say period A and period B. We used period A to come up with some trading rules that yield amazing returns. As shown by the long-term algorithmic exogenous quant investing strategy example in the previous subsection, it is very much possible to create such rules for basically any data set with enough creativity. So showing that the rules work on period A is essentially meaningless. In order to successfully gauge the performance of a set of trading rules, they need to be applied to data points that are not the same as those that were used to create the rules. Hence, only period B can tell us how good our strategy actually is.
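The procedure can be sketched as follows. This is a toy illustration under my own assumptions: the rule (“sell 1 BTC at the first price at or above some level, buy back at the last price”), the function names and the prices are all made up for the example.

```python
from typing import Callable, Sequence, Tuple


def sell_once_score(level: float, prices: Sequence[float]) -> float:
    """BTC held at the end if we sell 1 BTC at the first price >= level
    and buy back at the last price; 1.0 if the level is never reached."""
    for price in prices:
        if price >= level:
            return price / prices[-1]
    return 1.0


def fit_then_test(
    candidates: Sequence[float],
    score: Callable[[float, Sequence[float]], float],
    period_a: Sequence[float],
    period_b: Sequence[float],
) -> Tuple[float, float, float]:
    """Pick the best candidate on period A only, then score it on period B."""
    best = max(candidates, key=lambda c: score(c, period_a))
    return best, score(best, period_a), score(best, period_b)


period_a = [100.0, 300.0, 150.0]  # made-up in-sample prices
period_b = [100.0, 250.0, 500.0]  # made-up out-of-sample prices
print(fit_then_test([200.0, 400.0], sell_once_score, period_a, period_b))
# (200.0, 2.0, 0.5)
```

The level that doubles our BTC on period A halves it on period B. Only the second number says anything about the quality of the rule.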

In this explanation, the data in period A is referred to as in-sample data and the data in period B as out-of-sample data. PlanB is well aware of the difference between the two, but unfortunately fails to mention the possible implications for his own analysis. So I will mention them here using a small example.

Example

Let’s take the liberty of removing all data from before July 2013. Then the BTC price vs RSI chart would have looked like this:

Now recall PlanB’s original sell rule that consisted of two parts: sell if i) the RSI is below 65% now AND ii) the RSI was above 90% in at least one of the last six months. Based on the new in-sample data set, this sell rule would make no sense, because we can simplify the rule and gain a lot of performance. We simply remove part i) of the rule and adjust part ii), such that our new sell rule is as follows: sell if the RSI is above 95%.

We find that this new strategy (in green) yields returns that are almost three times as high as the returns of the old strategy (in blue): 14.4 BTC vs 5.6 BTC. Moreover, the risk (as measured by drawdown) is significantly lower: -30% vs -49%. The red line corresponds to the strategy that sells if the RSI is above 90%. This one yields reasonable returns, but is nowhere near optimal. Hence, for this particular data set, the RSI >95% sell strategy is definitely the way to go.

But here is the catch: how does this strategy perform if we add the out-of-sample data from January 2012* to July 2013 back into the mix? See for yourself in the graph below. Evidently, the performance before July 2013 is so bad that our new strategy just barely outperforms buy-and-hold (which equals 1 on the right axis) once we consider the full picture. This is the power of out-of-sample performance and the reason that it should always be included in strategy testing.

*I ignore data from 2011 on purpose because of the extremely high and inaccurate RSI calculations in that year, causing our new strategy to immediately sell and yield practically zero returns. Since this would be an unfair way to prove my point, I start in January 2012.

In this subsection, I showed that only out-of-sample analysis can judge whether a trading strategy performs well in general. The above example shows a strategy that performs exceptionally well at first, but totally fails once we add one and a half years of out-of-sample data. No sane investor would bet their money on it, no matter how good the in-sample performance was. In conclusion, the fact that PlanB’s strategy outperforms buy-and-hold by a factor of roughly 10 on an in-sample basis does not say anything about how well it would actually perform overall.

Overfitting

The third and final point of criticism I will put forward is very closely related to the previous one. For a very good example of what overfitting is and how dangerous it can be, please consider this article on how it caused the Fukushima disaster. PlanB himself describes the general concept fairly well:

Overfitting happens when an algorithm memorizes a dataset (including the noise) instead of generalizing the underlying signal. Overfitting gives good model performance when making a model but poor model performance on new out-of-sample data. Preventing overfitting is an art in TA, statistics and AI.

Ironically enough, while he is familiar with the definition and apparently even considers himself an artist in preventing overfitting, he still totally fails to do so. His desperate attempt to prevent it is to use only one metric in order to keep the model as simple as possible. Sometimes that approach is sufficient, but overfitting also happens if you make your model too dependent on the data of that one particular metric. Remember when I said that his trading rules seemed a little complicated for a simple metric such as the RSI? That’s exactly where he started overfitting. Here’s why:

Sell signal

Sell if the RSI is below 65% now AND was above 90% in at least one of the last six months.

In the previous subsection on out-of-sample performance, I showed that the “below 65%” part of this sell rule is solely based on the double top in 2013. If we ignore the data from early 2013, there is absolutely no need for that part whatsoever. If that is not memorising the data set, then I don’t know what is.

Ignoring the “below 65%” part, I would also argue that the 90% in the second part of this rule is chosen only because the single top in March 2021 narrowly failed to exceed 95%, a level that all other tops did exceed. Let’s say we didn’t have data from March 2021 onwards and created trading rules. We would have definitely chosen the 95% RSI barrier, because its performance would have been two times better. But nowadays we know that that would have caused us to miss the selling opportunity in March 2021. So we choose 90% instead of 95% based on one observation only. What’s that called? Oh yeah, overfitting.
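A toy backtest shows how sensitive the result is to that single threshold. The sketch below is an assumption-laden simplification: the buy rule is reduced to “buy back when the RSI drops below 50”, costs are ignored, and both data sets are made up to mimic a top with an RSI of 96 and a top with an RSI of 92.

```python
from typing import Sequence


def backtest_sell_threshold(
    prices: Sequence[float],
    rsi_values: Sequence[float],
    sell_above: float,
    buy_below: float = 50.0,
) -> float:
    """Final portfolio in BTC, starting from 1 BTC.

    Sells everything when RSI > sell_above, buys back when RSI < buy_below.
    """
    btc, usd = 1.0, 0.0
    for price, r in zip(prices, rsi_values):
        if btc > 0 and r > sell_above:
            usd, btc = btc * price, 0.0
        elif usd > 0 and r < buy_below:
            btc, usd = usd / price, 0.0
    return btc + usd / prices[-1]


# Made-up cycle whose top reaches an RSI of 96.
prices_1, rsi_1 = [100.0, 400.0, 1000.0, 500.0, 250.0], [60.0, 92.0, 96.0, 70.0, 45.0]
# Made-up cycle whose top only reaches an RSI of 92.
prices_2, rsi_2 = [100.0, 800.0, 400.0, 200.0], [60.0, 92.0, 70.0, 45.0]

for threshold in (90.0, 95.0):
    print(
        threshold,
        backtest_sell_threshold(prices_1, rsi_1, threshold),
        backtest_sell_threshold(prices_2, rsi_2, threshold),
    )
```

On the first cycle the 95% barrier wins (4 BTC vs 1.6 BTC) because it sells closer to the top. On the second cycle it never sells at all and ends with 1 BTC, while the 90% barrier ends with 4 BTC. One observation is enough to flip the choice.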

Buy signal

Buy if the RSI is more than 2% higher than the low AND was below 50% in at least one of the last six months.

Now consider the following statement made in the article:

Traditionally, RSI above 70 indicates an overbought situation and RSI below 30 indicates an oversold condition. However, the BTC range is different because BTC RSI can go as high as 90–100 and has never been lower than 40.

We now know that the RSI of traditional assets can go below 30%. And we are already in buying territory if the Bitcoin RSI is at 50%? It seems foolish to think that an RSI around 30% will not happen in the coming years, just because in the one case of Bitcoin it hasn’t happened yet. We have also never seen the Bitcoin price exceed $70k, but many of us still think the chance of this happening eventually is quite high, right? Once again, we’re memorising the data instead of thinking about what it actually means.

Conclusion

In this article, I reviewed PlanB’s strategy, which uses the Relative Strength Index to construct trading rules that would have yielded outstanding returns in hindsight. Based on the interpretation of the trading rules, the lack of out-of-sample analysis, and the obvious presence of overfitting, I can state without doubt that this strategy should not be considered a useful tool for outperforming a simple buy-and-hold strategy in the future. Sure, there is a reasonable chance that it will outperform, but not by virtue of the model’s logic.

Luckily, PlanB appears to be prepared for the case in which his strategy fails or is debunked. He introduces a (fairly substantial) list of disclaimers corresponding with his four steps:

Data: the data could contain errors. Sure, but since we only use price data we should be fine. It is probably the most accurate data in crypto.

Information: the calculation of RSI could be wrong. Yes, which is why quants should always perform their own RSI calculation.

Knowledge: the correlation between RSI and BTC could be spurious. Impossible: the RSI is based on price data and nothing else. By construction, there is correlation between the two. Fact.

Wisdom: overfitting of the trading rule, the backtest could be wrong, there could be black swans and past performance is no guarantee of future results. It is the responsibility of the quant to eliminate the risks of overfitting and incorrect backtesting. This basically says “I may not have done my job correctly”. The black swans part is true, but only for actual black swans. Lastly, the past performance part aligns with my points, but should have been included in the analysis instead of the disclaimers.

Appendix

I used the following code to calculate the Relative Strength Index:

If you want to review my backtesting method, I’ve created a Python class specifically for the purpose of this article.

Tim Stolte
Amdax Asset Management

Quantitative Researcher at Amdax. Master’s degree in Econometrics / Quantitative Finance.