Backtesting Biases and How to Avoid Them

Auquan
Jan 31, 2017

In God we trust. All others must bring data.

Data forms the spine of backtesting and is of prime importance to data scientists and trading coders. For the uninitiated, backtesting is the process of simulating a trading strategy on historical data. Used primarily by data scientists and hedge funds, it makes it easier to assess a trading strategy by testing ideas and rejecting the ones that do not hold up.

However, backtesting is not without its shortcomings. Most often, there is a gap between what is simulated and what happens in live trading. Backtesting draws on many things, including mathematics, statistics and psychology. Yet it is exposed to pitfalls because biases intrude into the simulation.

We’ve compiled some of the most common biases that creep into backtests, along with ways to avoid them.

Optimization Bias

The simplest way to explain this is with a little help from Murphy’s Law: anything that can go wrong will go wrong. We cannot be too prepared for a shortcoming, and optimization bias is a case in point. Also known as data snooping bias, it happens when you add too many parameters to your algorithm and fine-tune them on the available data. You end up testing the algorithm only against what has already happened, not against what could happen.

The best way to avoid this bias is to keep your simulation system as simple as possible. Use fewer parameters and simulate your algorithm across diverse markets and time periods. Also, once you’re done with backtesting, it’s recommended that you run the algorithm on new, unfamiliar data to check that the system still performs as it did in the backtest.
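One way to picture this advice is a simple holdout split: tune on the earlier part of the data, then evaluate once on a later part the tuning never saw. The sketch below is only an illustration. The function names, the toy momentum rule and the price series are all assumptions made up for this example, not part of any particular backtesting library.

```python
def split_in_out_of_sample(prices, holdout_size):
    """Split a chronologically ordered series into a tuning part and a holdout part."""
    cut = len(prices) - holdout_size
    return prices[:cut], prices[cut:]


def momentum_pnl(prices, lookback):
    """Toy rule: hold one unit over the next step if the price is above
    its level `lookback` steps ago. Returns the summed price change."""
    pnl = 0.0
    for t in range(lookback, len(prices) - 1):
        if prices[t] > prices[t - lookback]:
            pnl += prices[t + 1] - prices[t]
    return pnl


def pick_lookback(prices, candidates):
    """Choose the lookback with the best in-sample P&L.
    This is the step where over-tuning can creep in."""
    return max(candidates, key=lambda lb: momentum_pnl(prices, lb))


# Invented prices, purely for illustration.
prices = [100, 101, 103, 102, 104, 107, 106, 108, 111, 110,
          109, 107, 108, 106, 105, 107, 104, 103, 105, 102]

in_sample, out_of_sample = split_in_out_of_sample(prices, holdout_size=6)
best = pick_lookback(in_sample, candidates=[1, 2, 3])

in_pnl = momentum_pnl(in_sample, best)          # the number you tuned for
out_pnl = momentum_pnl(out_of_sample, best)     # the number that matters
print(best, in_pnl, out_pnl)
```

A large gap between the in-sample and out-of-sample figures is a hint that the parameters were fitted to the past rather than to anything that persists. The holdout only stays "unfamiliar" if it is looked at once, after tuning is finished.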

Look-ahead Bias

Since you have access to the entire data set, it is easy to overlook that a backtest sometimes uses future information, that is, information that would not have been available at the time of the trade. When you keep backtesting on the same data set, you are more likely to introduce a look-ahead bias into the system without meaning to. The causes can be as subtle as a technical bug or the use of maximal and minimal values that are only known once the period has ended. Look-ahead bias makes backtest results diverge from live trading, so it’s important that you avoid it. One way is to use the same algorithm or code for both live trading and backtesting: in live trading the future data does not exist, so code that attempts to look ahead fails.
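A related safeguard is to structure the backtest so the strategy only ever receives data up to the current step, much as it would in live trading. The following is a minimal sketch under that assumption. `run_backtest`, the two toy rules and the prices are hypothetical names and values invented for illustration.

```python
def run_backtest(prices, decide):
    """Walk forward through time, handing the strategy only past and current prices."""
    positions = []
    for t in range(len(prices)):
        history = prices[: t + 1]      # nothing after index t is visible
        positions.append(decide(history))
    return positions


def above_running_mean(history):
    """Toy rule: go long (1) when the latest price exceeds the mean of
    everything seen so far, otherwise stay flat (0)."""
    return 1 if history[-1] > sum(history) / len(history) else 0


def above_full_sample_mean(prices):
    """Biased variant: compares each price with the mean of the WHOLE
    series, which includes prices from the future."""
    full_mean = sum(prices) / len(prices)
    return [1 if p > full_mean else 0 for p in prices]


prices = [10, 11, 12, 20, 21, 22]   # invented numbers

honest = run_backtest(prices, above_running_mean)
biased = above_full_sample_mean(prices)
print(honest)   # [0, 1, 1, 1, 1, 1]
print(biased)   # [0, 0, 0, 1, 1, 1]
```

The two rules disagree on the second and third steps because the biased version already "knows" that prices will jump later. Nothing crashes in the biased version, which is exactly why this kind of bug is easy to miss when the whole data set is in memory.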

Survivorship Bias

This is another bias that coders and data scientists overlook. When you use a stock database as it exists today for a backtest, you consider only the stocks that are available, or rather alive, at this point in time. What you’re missing is the stocks that are no longer listed. Because you consider only the stocks that have survived and ignore the delisted ones, it’s rightly called survivorship bias.

Consider a strategy that looks at stocks in the S&P 500 and wants to beat the returns of the index. If you backtest using the stocks that make up the index at present, your backtest suffers from survivorship bias. Survivorship bias can have serious consequences in live trading. However, it can be minimized by purchasing databases that include delisted stocks as well. You can also reduce it by using more recent data in your backtests, since fewer stocks will have dropped out over a shorter, more recent period.
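To make the difference concrete, here is a small sketch of a point-in-time universe lookup. It assumes you have a database that records when each stock entered and left the universe. The tickers, years and function names are invented for illustration and do not describe any real index or data vendor.

```python
# Hypothetical membership table: ticker -> (first_year, last_year).
# A last_year of None means the ticker is still listed. All values are invented.
membership = {
    "AAA": (2005, None),
    "BBB": (2005, 2009),   # delisted after 2009
    "CCC": (2008, None),
}


def universe_as_of(year, membership):
    """Tickers that were actually tradable in `year`, including ones delisted later."""
    return sorted(
        ticker
        for ticker, (start, end) in membership.items()
        if start <= year and (end is None or year <= end)
    )


def survivors_only(membership):
    """The biased universe: only tickers that still exist today."""
    return sorted(t for t, (_, end) in membership.items() if end is None)


print(universe_as_of(2007, membership))   # ['AAA', 'BBB']
print(survivors_only(membership))         # ['AAA', 'CCC']
```

A backtest for 2007 built on `survivors_only` would drop BBB, which later disappeared, and include CCC, which was not yet tradable. Selecting the universe as of each backtest date avoids both problems.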

Neglecting Market Impacts

The historical data that you use does not include your own trades. So the simulation does not really forecast the price you’re most likely to get when you trade live. Since trading and pricing go hand in hand, neglecting market impact introduces a bias into your backtesting results. A simple fix is to always assume that when you trade, prices will move against you. This presumption, while conservative, reduces the bias and yields more realistic results.
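The "prices will be against you" assumption can be sketched as a fixed slippage penalty applied to every fill. This is a deliberately crude model. The function names and the 5 basis point figure are placeholders chosen for illustration, not estimates for any real market, and real impact typically depends on order size and liquidity.

```python
def fill_price(quoted_price, side, slippage_bps=5.0):
    """Assume the fill is worse than the quote by `slippage_bps` basis points.

    side is +1 for a buy (pay more) and -1 for a sell (receive less).
    The default of 5 bps is an arbitrary placeholder.
    """
    if side not in (1, -1):
        raise ValueError("side must be +1 (buy) or -1 (sell)")
    return quoted_price * (1 + side * slippage_bps / 10_000)


def round_trip_pnl(buy_quote, sell_quote, quantity, slippage_bps=5.0):
    """P&L of buying and later selling `quantity` units with adverse fills on both legs."""
    paid = fill_price(buy_quote, +1, slippage_bps) * quantity
    received = fill_price(sell_quote, -1, slippage_bps) * quantity
    return received - paid


# Invented quotes: buy at 100, sell at 101, 10 units.
naive = round_trip_pnl(100.0, 101.0, 10, slippage_bps=0.0)
conservative = round_trip_pnl(100.0, 101.0, 10, slippage_bps=5.0)
print(naive, conservative)   # roughly 10.0 and 8.995
```

Even a small per-trade penalty removes about a tenth of the profit in this toy example, and the effect compounds for strategies that trade often. If a strategy only looks profitable with zero slippage, that is a strong reason to reject it.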

Most importantly, you can avoid fallacies in backtesting when you change your views about it. If the purpose of your backtesting is to assess the accuracy or efficiency of your strategy, be prepared for the results to differ from what you assume. Start looking at backtesting as a filtration process for eliminating strategies, and be strict about it, and you’ll have strategies that are more accurate and negligibly biased. Good luck!

Originally published at auquan.com on January 31, 2017.
