Fooled by Randomness, Over-fitting And Selection Bias

There are software programs that combine technical indicators with exit conditions in order to design trading strategies that meet desired performance criteria and risk/reward objectives. Because of data-mining bias, it is very difficult to distinguish random strategies from those that may possess some intelligence in pairing their trades with market returns.

Suppose that you have such a program and you want to use it to develop a strategy for trading the SPY ETF. After a number of iterations, manual or automatic, you get a relatively nice equity curve in both the in-sample and the out-of-sample periods, with a total of about 1000 trades (horizontal axis):

Before obtaining the above equity curve, many combinations of entry and exit methods were tried, usually hundreds or thousands, and in some cases billions or even trillions. A large number of equity curves were generated that were not acceptable. You may think that this is a good equity curve but you also suspect the strategy may be random due to data-mining bias arising from multiple comparisons. Below is another example where the developer thinks that by increasing the number of trades, randomness is minimized.

In this example the number of trades was increased by two orders of magnitude to about 100,000, and both the in-sample and out-of-sample performance look acceptable. Does this mean that in this case the strategy is less likely to be random?

The answer is no. Both of the above equity curves were actually generated by tossing a coin with a payout equal to +1 for heads and -1 for tails. The second equity curve was generated after only a few simulations. Both curves are random. You can try the simulation yourself and see how successive random runs can at some point generate nice-looking equity curves by luck alone.
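A minimal Python sketch of such a simulation follows. It is not the code used to produce the curves above. The function names, the number of runs, and the +50 threshold for calling a curve "nice" are illustrative assumptions.

```python
import random

def coin_toss_equity(n_trades, rng):
    """Cumulative equity of n_trades fair coin tosses: +1 for heads, -1 for tails."""
    equity = []
    total = 0
    for _ in range(n_trades):
        total += 1 if rng.random() < 0.5 else -1
        equity.append(total)
    return equity

def count_nice_curves(n_runs, n_trades, min_final, seed=0):
    """Count random curves whose final equity is at least min_final.

    min_final is an arbitrary stand-in for "looks acceptable" (an assumption).
    """
    rng = random.Random(seed)
    nice = 0
    for _ in range(n_runs):
        if coin_toss_equity(n_trades, rng)[-1] >= min_final:
            nice += 1
    return nice

if __name__ == "__main__":
    runs = 200
    lucky = count_nice_curves(n_runs=runs, n_trades=1000, min_final=50)
    print(f"{lucky} of {runs} purely random curves ended at +50 or better")
```

With a fair coin and 1000 tosses, a normal approximation suggests that roughly 6% of runs finish at +50 or better, so a developer who keeps trying will find an acceptable-looking curve fairly quickly. Changing the seed or the threshold changes the count but not the conclusion.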

The correct logic here is that random processes can generate nice-looking equity curves. But how can we know whether a nice-looking equity curve, selected from a group of other not-so-nice-looking curves, actually represents a random process, meaning that the underlying algorithm has no intelligence? This inverse question is much more difficult to answer.

The coin toss experiment illustrates how one may get fooled by randomness when using a process that generates many equity curves, some acceptable and some unacceptable. Minimizing data-mining bias that arises from over-fitting, data snooping and selection bias is a complex and involved process that for the most part falls outside the capabilities of the average developer who lacks an in-depth understanding of these issues. Actually, the methods for analyzing strategies for the presence of bias are in many cases more important than the methods used to generate them and are considered an integral part of a trading edge.
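The selection effect can be sketched in Python under simple assumptions. Each hypothetical candidate strategy below is a fair coin, its history is split into equal in-sample, out-of-sample and forward segments, and only candidates that gain at least a fixed amount in both of the first two segments are kept. The segment length, threshold and function names are all illustrative choices, not part of any real strategy-development tool.

```python
import random
import statistics

def segment_gain(n_trades, rng):
    """Net gain of n_trades fair coin tosses (+1 heads, -1 tails)."""
    return sum(1 if rng.random() < 0.5 else -1 for _ in range(n_trades))

def select_and_forward_test(n_candidates=3000, n_trades=300, threshold=16, seed=42):
    """Keep random candidates that pass in-sample AND out-of-sample filters.

    Returns the forward-period gains of the survivors. All parameter
    values are assumptions chosen only for illustration.
    """
    rng = random.Random(seed)
    forward_gains = []
    for _ in range(n_candidates):
        in_sample = segment_gain(n_trades, rng)
        out_of_sample = segment_gain(n_trades, rng)
        forward = segment_gain(n_trades, rng)
        if in_sample >= threshold and out_of_sample >= threshold:
            forward_gains.append(forward)
    return forward_gains

if __name__ == "__main__":
    survivors = select_and_forward_test()
    print(f"survivors: {len(survivors)}")
    print(f"mean forward gain of survivors: {statistics.mean(survivors):.2f}")
```

Every survivor has a history that looks profitable in both segments, yet the forward gains should average close to zero because the coin has no memory. Passing an out-of-sample check does not by itself separate a random strategy from a genuine one once many candidates have been tried.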

Often, Monte Carlo simulation is employed in an effort to differentiate random from robust strategies. Note that this use of Monte Carlo simulation is inappropriate and generates misleading results. For more details see this article. In a nutshell, any validation method that becomes part of a data-mining process loses its effectiveness due to data snooping. For more details about data-mining bias and its components see this article.

Click here for suggestions about reducing data-mining bias.

This article was originally published on the Price Action Lab Blog.

If you have any questions or comments, I am happy to connect on Twitter: @mikeharrisNY

Disclaimer: No part of this article constitutes a trade recommendation. The past performance of any trading system or methodology is not necessarily indicative of future results. Read the full disclaimer here.

About the author: Michael Harris is a trader and best-selling author. Seventeen years ago he developed the first commercial software for identifying parameter-less patterns in price action. In the last seven years he has worked on the development of DLPAL, a software program that can be used to identify short-term anomalies in market data for use with fixed and machine learning models. Click here for more.