How to Prevent Overfitting in Backtesting

Sam Quantman
Published in OpenHarbor
May 27, 2024

Preventing overfitting in quant strategy backtesting is a critical task. It involves finding the right balance between simplicity and flexibility in model development.

Overfitting is a serious and very common problem in quant strategy backtesting: a model is tuned so closely to the historical data that it fits noise rather than a repeatable pattern. Such a model can perform poorly when applied to the real market. The primary challenge is to balance simplicity and flexibility in model development so that the model generalizes to the crypto market.

In a hypothetical scenario, a quant trader develops a strategy that involves buying and selling assets based on a set of technical indicators. The strategy performs exceptionally well in backtesting, delivering high returns and low drawdowns. Excited by the results, the trader decides to implement the strategy in live trading.

However, once the strategy is put into action, it fails to perform as expected. The returns are significantly lower than in the backtest, and the drawdowns are much higher. The trader realizes that the strategy was overfit to the historical data and did not generalize well to new data. This scenario illustrates the importance of preventing overfitting when backtesting.

The implications of overfitting are significant. It can produce strategies that are overly complex and that fit noise instead of capturing the underlying patterns in the data. This leads to poor performance in real-world trading and, ultimately, to financial losses. It is therefore crucial to guard against overfitting when backtesting a quant strategy. The following are key practices we can rely on.

Cross Validation

This involves splitting the data into several folds, each with its own training set and testing set. The model is developed on each training set and then evaluated on the corresponding testing set. This helps to ensure that the model is not just a product of the specific data set it was trained on, but can also generalize to new data.

For instance, OpenHarbor trained its momentum strategy on Bitcoin (the BTCUSDT pair), and then tested it on other cryptocurrencies such as Ethereum (the ETHUSDT pair) over different periods to cross validate.

Figure: A momentum backtest on ETHUSDT after training with BTCUSDT
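To make the fold-based splitting described above concrete, here is a minimal sketch of one common variant for time-ordered data, a walk-forward split, where each fold trains only on bars that come before its test window. This is an illustration and not OpenHarbor's confirmed procedure; the function name `walk_forward_splits` and the parameters `n_folds` and `min_train` are hypothetical.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) index ranges in chronological order.

    Each fold trains on every bar before its test window, so the model
    is never evaluated on data that precedes part of its training set.
    """
    if n_folds < 1 or min_train < 1 or n_samples <= min_train:
        raise ValueError("need n_folds >= 1 and n_samples > min_train >= 1")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        test_start = min_train + k * test_size
        test_end = test_start + test_size
        yield range(0, test_start), range(test_start, test_end)


# Hypothetical usage: 1,000 bars, 4 folds, at least 600 bars of training data.
splits: List[Tuple[range, range]] = list(walk_forward_splits(1000, 4, 600))
for train_idx, test_idx in splits:
    print(f"train [0, {train_idx.stop})  test [{test_idx.start}, {test_idx.stop})")
```

The same index ranges can be applied to a second asset's price series, which is one way to combine out-of-period and cross-asset checks like the BTCUSDT and ETHUSDT example.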

Simplicity

The more complex the model, the more likely it is to overfit. It is therefore beneficial to keep the model as simple as possible while still capturing the underlying patterns in the data. This can be achieved by selecting only relevant features and limiting the number of parameters. For this reason, OpenHarbor selects its momentum indicators very carefully, and keeps just a few of them that are consistently significant.
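One way to picture "keeping only indicators that are consistently significant" is the sketch below. It is an assumption-laden illustration, not OpenHarbor's actual selection rule: it uses a plain Pearson correlation with forward returns as a stand-in for significance, and the names `consistent_indicators`, `n_periods`, and `min_abs_corr`, as well as the toy data, are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def consistent_indicators(
    indicators: Dict[str, Sequence[float]],
    forward_returns: Sequence[float],
    n_periods: int = 3,
    min_abs_corr: float = 0.1,
) -> List[str]:
    """Keep indicators whose correlation with forward returns has the same
    sign, and at least min_abs_corr magnitude, in every sub-period."""
    size = len(forward_returns) // n_periods
    kept = []
    for name, values in indicators.items():
        corrs = [
            pearson(values[i * size:(i + 1) * size],
                    forward_returns[i * size:(i + 1) * size])
            for i in range(n_periods)
        ]
        same_sign = all(c > 0 for c in corrs) or all(c < 0 for c in corrs)
        if same_sign and min(abs(c) for c in corrs) >= min_abs_corr:
            kept.append(name)
    return kept


# Toy data (hypothetical): one stable indicator, one whose sign flips after bar 30.
momentum = [float(i % 7 - 3) for i in range(90)]
unstable = [m if i < 30 else -m for i, m in enumerate(momentum)]
forward_returns = [0.01 * m + 0.002 * ((i * 5) % 3 - 1)
                   for i, m in enumerate(momentum)]

kept = consistent_indicators(
    {"momentum": momentum, "unstable": unstable}, forward_returns
)
print(kept)
```

Requiring agreement across every sub-period is deliberately strict: an indicator that only works in one stretch of history is exactly the kind of feature that tends to inflate a backtest.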

Regularization

This technique adds a penalty term to the cost function to discourage the model from becoming too complex. This can help to prevent overfitting and improve the model’s ability to generalize to new data. In the same spirit, the momentum strategy in OpenHarbor uses only first-degree terms of its indicators, avoiding an unnecessarily complex structure that could result in overfitting.
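As a rough illustration of a penalty term in the cost function, the sketch below fits a first-degree (linear) model by gradient descent with an L2 penalty, one common form of regularization. The data is synthetic, the function name `fit_linear` and the parameter `l2_penalty` are hypothetical, and nothing here is taken from OpenHarbor's strategy; in practice a library implementation such as scikit-learn's `Ridge` would be the usual choice.

```python
import random
from typing import List


def fit_linear(
    X: List[List[float]],
    y: List[float],
    l2_penalty: float,
    lr: float = 0.05,
    n_iter: int = 500,
) -> List[float]:
    """Minimise mean squared error + l2_penalty * sum(w_j ** 2)
    with plain gradient descent, starting from zero weights."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        # Prediction errors under the current weights.
        residuals = [
            sum(wj * xj for wj, xj in zip(w, row)) - yi
            for row, yi in zip(X, y)
        ]
        for j in range(p):
            grad = 2.0 / n * sum(r * row[j] for r, row in zip(residuals, X))
            grad += 2.0 * l2_penalty * w[j]  # gradient of the penalty term
            w[j] -= lr * grad
    return w


# Synthetic data (assumption): only the first of three indicators carries signal.
rng = random.Random(42)
X = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(120)]
y = [0.8 * row[0] + rng.gauss(0.0, 0.1) for row in X]

w_plain = fit_linear(X, y, l2_penalty=0.0)
w_ridge = fit_linear(X, y, l2_penalty=0.5)
print("no penalty:", [round(v, 3) for v in w_plain])
print("L2 penalty:", [round(v, 3) for v in w_ridge])
```

The penalized weights come out smaller in overall magnitude than the unpenalized ones. The penalty strength is itself a parameter, so it should be chosen on held-out data rather than on the backtest being reported.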

In conclusion, preventing overfitting in quant strategy backtesting is a critical task. It involves finding the right balance between simplicity and flexibility in model development. By employing practices such as cross validation, simplicity and regularization, it’s possible to develop models that generalize well to the crypto market. This is essential for the success of quant strategies in real-world trading.
