Backtesting a Sports Betting Strategy

Assessing the past to predict future profits

Estèphe
Systematic Sports
5 min read · Oct 27, 2023


Hedge funds employ backtesting to trade billions of dollars. We’ve done it for the sports markets.

So, how do you backtest a strategy? And how should you interpret the results?

What is Backtesting?

Backtesting is the process of testing a trading strategy using historical data. It helps quants understand how a strategy would have performed in the past, allowing them to validate and refine their strategies before live trading.

Ted Jiang, the quant from The Big Short (source)

We have built the data, infrastructure and software for backtesting at Systematic Sports to enable our quant research and improve our betting strategies.

Historical Data

First, ensure you have the data. At Systematic Sports, we have onboarded a decade of historical football data into our data lake, starting from the 2014/2015 season.

Building your Engine

A backtesting engine is a piece of software that combines your historical data with trading logic to simulate how your strategy would have performed.

Systematic Sports’ backtesting engine works as follows:

1. Pass historical strategy bets to the engine

Fixture, Team, Market, Stake
Schema of data passed into the engine.

These bets are generated by running the betting algorithm over our historical data, which simulates the bets it would have placed at the time. A starting bankroll is also specified (default $1,000).
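As a rough illustration, the schema above could be represented like this. It is a minimal sketch, not our production code: the `Bet` class, the `STARTING_BANKROLL` name and the fixture and team names are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bet:
    """One simulated bet, mirroring the Fixture, Team, Market, Stake schema."""
    fixture: str  # hypothetical identifier format
    team: str
    market: str
    stake: float

    def __post_init__(self):
        # A bet with no money on it cannot be settled meaningfully.
        if self.stake <= 0:
            raise ValueError("stake must be positive")


# Default starting bankroll mentioned in the article.
STARTING_BANKROLL = 1000.0

# Made-up bets, standing in for the output of a betting algorithm.
bets = [
    Bet("Home FC v Away FC", "Home FC", "Match Odds", 20.0),
    Bet("North Town v South City", "South City", "Match Odds", 15.0),
]
```

Freezing the dataclass keeps historical bets immutable once they have been generated, so the engine cannot accidentally alter them during a run.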

2. Retrieve historical odds

The engine maps each bet to the odds that were available at the time. It approximates these by taking the best odds available for each game at kick-off.
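A minimal sketch of that lookup, assuming decimal odds and a list of quotes already snapshotted at kick-off. The `best_odds_at_kickoff` function, the dictionary keys and the bookmaker names are assumptions for illustration only.

```python
def best_odds_at_kickoff(quotes, fixture, team, market):
    """Return the highest decimal odds quoted for a selection, or None if unquoted."""
    prices = [
        q["odds"]
        for q in quotes
        if (q["fixture"], q["team"], q["market"]) == (fixture, team, market)
    ]
    # For a backer, the "best" odds are the highest ones on offer.
    return max(prices, default=None)


# Made-up kick-off snapshot from three hypothetical venues.
quotes = [
    {"venue": "BookieA", "fixture": "Home FC v Away FC", "team": "Home FC",
     "market": "Match Odds", "odds": 2.05},
    {"venue": "BookieB", "fixture": "Home FC v Away FC", "team": "Home FC",
     "market": "Match Odds", "odds": 2.10},
    {"venue": "BookieC", "fixture": "Home FC v Away FC", "team": "Away FC",
     "market": "Match Odds", "odds": 3.60},
]

best = best_odds_at_kickoff(quotes, "Home FC v Away FC", "Home FC", "Match Odds")
```

Returning `None` for a selection with no quotes lets the caller decide whether to skip the bet or flag a data gap, rather than silently inventing a price.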

3. Calculate each bet’s PnL

The engine evaluates each bet’s outcome by referring to the fixture’s result. It then calculates the bet’s “Profit and Loss” or “PnL”, accounting for exchange trading costs if the betting venue is Betfair Exchange. Betfair Exchange charges a 5% commission on winnings; this significantly impacts your betting returns and can’t be ignored in a reliable backtest.

Bet won: PnL = (Odds - 1) * Stake * (1 - exchange commission rate)

Bet lost: PnL = -Stake

Bet void: PnL = 0
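These three cases translate directly into code. The sketch below assumes decimal odds and applies commission per bet, exactly as in the formulas above; the `bet_pnl` name and the outcome labels are our own choices for the example.

```python
def bet_pnl(odds, stake, outcome, commission_rate=0.0):
    """PnL of a single bet at decimal odds.

    outcome is one of "won", "lost" or "void" (labels assumed for this sketch).
    commission_rate is 0.05 for a 5% exchange commission, 0.0 for no commission.
    """
    if outcome == "won":
        return (odds - 1) * stake * (1 - commission_rate)
    if outcome == "lost":
        return -stake
    if outcome == "void":
        return 0.0
    raise ValueError(f"unknown outcome: {outcome!r}")


# A winning 10 stake at odds of 2.5 with 5% commission keeps about 14.25
# of the 15 gross profit.
won = round(bet_pnl(2.5, 10.0, "won", commission_rate=0.05), 2)
lost = bet_pnl(2.5, 10.0, "lost", commission_rate=0.05)
```

Note the asymmetry: commission shrinks the wins but leaves the losses untouched, which is why ignoring it flatters a backtest.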

4. Generate a continuous time series of the strategy’s returns

For each historical day in the backtest, the PnL of that day’s bets is added to the running bankroll. The resulting series can be plotted to view the strategy’s performance.
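One simple way to build such a series is sketched below. It assumes each settled bet is a `(date, pnl)` pair and carries the bankroll forward on days without bets so the series has no gaps; the `bankroll_series` name and the sample figures are made up.

```python
from collections import defaultdict
from datetime import date, timedelta


def bankroll_series(settled, starting_bankroll=1000.0):
    """Return [(day, bankroll)] for every calendar day from first to last bet."""
    daily = defaultdict(float)
    for day, pnl in settled:
        daily[day] += pnl  # sum all bets settled on the same day
    if not daily:
        return []

    series, bankroll = [], starting_bankroll
    day, last = min(daily), max(daily)
    while day <= last:
        bankroll += daily.get(day, 0.0)  # no bets that day: bankroll unchanged
        series.append((day, bankroll))
        day += timedelta(days=1)
    return series


# Made-up settled bets: two on the first day, none on the second, one on the third.
settled = [
    (date(2023, 1, 1), 10.0),
    (date(2023, 1, 1), -5.0),
    (date(2023, 1, 3), 20.0),
]
series = bankroll_series(settled)
```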

The backtested performance of our flagship Value Model.

5. Calculate performance metrics for the strategy

Performance metrics evaluate a strategy’s key characteristics:

  • Return
  • Risk

Returns indicate the profit potential. Risk gauges the potential for losses and the strategy’s volatility. Investors make informed decisions by analysing both, balancing their desire for profit and risk tolerance.

Backtested performance metrics of our flagship Value Model: annualised return, Sharpe ratio, volatility and max drawdown, alongside total return.
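For orientation, here is a rough sketch of how metrics like these are commonly computed from a daily bankroll series. The conventions are assumptions on our part for the example, not a description of our engine: 365 periods per year, a risk-free rate of zero for the Sharpe ratio, and drawdown measured from the running peak.

```python
import math
from statistics import mean, stdev


def performance_metrics(bankroll, periods_per_year=365):
    """Compute common return and risk metrics from a list of daily bankroll values."""
    if len(bankroll) < 3:
        raise ValueError("need at least three bankroll values")

    # Simple daily returns between consecutive bankroll values.
    returns = [b / a - 1 for a, b in zip(bankroll, bankroll[1:])]
    total_return = bankroll[-1] / bankroll[0] - 1
    annualised_return = (1 + total_return) ** (periods_per_year / len(returns)) - 1

    daily_vol = stdev(returns)
    volatility = daily_vol * math.sqrt(periods_per_year)
    # Sharpe ratio with an assumed risk-free rate of zero.
    sharpe = (mean(returns) / daily_vol * math.sqrt(periods_per_year)
              if daily_vol > 0 else float("nan"))

    # Max drawdown: largest fractional fall from any running peak.
    peak, max_drawdown = bankroll[0], 0.0
    for value in bankroll:
        peak = max(peak, value)
        max_drawdown = max(max_drawdown, (peak - value) / peak)

    return {
        "total_return": total_return,
        "annualised_return": annualised_return,
        "volatility": volatility,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }


# Made-up bankroll path: up 10%, down 10%, up 10%.
metrics = performance_metrics([1000.0, 1100.0, 990.0, 1089.0])
```

In this toy path the bankroll falls from a peak of 1100 to 990, so the max drawdown is 10% even though the strategy ends in profit.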

We will explain these metrics in detail in a separate article.

Understanding your Backtest

A backtest helps you understand the strengths and weaknesses of your strategy.

Risk management is crucial to protecting your strategy from randomness and large drawdowns. A high-volatility backtest indicates that your strategy takes on a lot of risk. You can control risk by reducing the size of your stakes or by taking bets at shorter odds.

The feedback from your backtest informs future improvements to your strategy. Systematic Sports models have been refined through backtests to:

  • Control the minimum required probability of a bet.
  • Control exposure to single fixtures.

All while still picking high-value bets.
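To make these controls concrete, here is a hypothetical filter in the same spirit. Everything in it is an assumption for illustration: the `select_bets` function, the threshold values, and the definition of "value" as model probability times decimal odds being greater than 1.

```python
def select_bets(candidates, min_probability=0.3, max_fixture_exposure=50.0):
    """Keep positive-value bets above a probability floor, capping stake per fixture."""
    selected, exposure = [], {}
    # Consider the highest expected value first so the cap keeps the best bets.
    ranked = sorted(candidates, key=lambda c: c["probability"] * c["odds"], reverse=True)
    for c in ranked:
        edge = c["probability"] * c["odds"] - 1  # expected profit per unit staked
        if edge <= 0 or c["probability"] < min_probability:
            continue
        room = max_fixture_exposure - exposure.get(c["fixture"], 0.0)
        stake = min(c["stake"], room)  # trim the stake to the remaining room
        if stake <= 0:
            continue
        exposure[c["fixture"]] = exposure.get(c["fixture"], 0.0) + stake
        selected.append({**c, "stake": stake})
    return selected


# Made-up candidates from a hypothetical model.
candidates = [
    {"fixture": "F1", "team": "A", "probability": 0.5, "odds": 2.2, "stake": 40.0},
    {"fixture": "F1", "team": "B", "probability": 0.4, "odds": 3.0, "stake": 30.0},
    {"fixture": "F2", "team": "C", "probability": 0.1, "odds": 15.0, "stake": 10.0},
    {"fixture": "F2", "team": "D", "probability": 0.6, "odds": 1.5, "stake": 10.0},
]
chosen = select_bets(candidates)
```

Here team C is dropped as a long shot despite its high value, team D is dropped for having no edge, and team A's stake is trimmed so that total exposure to fixture F1 stays within the cap.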

Our backtesting engine will continue to inform the research we do to improve our strategies.

Pitfalls of Backtesting

“Theory will only take you so far” — Oppenheimer

1. Overfitting

Imagine predicting someone’s outfit based on a week’s observation. On a rainy week, the subject wore a raincoat every day. You may conclude that they wear raincoats every day. If you base future predictions on this, you’d often be wrong.

Overfitting a dataset.

Overfitting happens when a model is trained too closely on a specific dataset. It’s like memorising the answers for a test instead of understanding the subject.

This is mitigated by ensuring a large dataset and varying market environments to backtest on (e.g. multiple leagues and seasons). The more parameters your strategy has, the easier it is to overfit. Limit the number of parameters and make sure each has a logical justification.
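One common safeguard, which we name here as an illustration rather than as a description of our own process, is walk-forward validation: tune parameters on earlier seasons only, then judge the strategy on a later season it has never seen. A minimal sketch, with a hypothetical `walk_forward_splits` helper:

```python
def walk_forward_splits(seasons, min_train=3):
    """Yield (training_seasons, test_season) pairs in chronological order."""
    ordered = sorted(seasons)  # "2014/2015"-style labels sort chronologically
    for i in range(min_train, len(ordered)):
        # Train on everything before season i, test on season i itself.
        yield ordered[:i], ordered[i]


seasons = ["2014/2015", "2015/2016", "2016/2017", "2017/2018", "2018/2019"]
splits = list(walk_forward_splits(seasons))
```

With five seasons and a minimum of three for training, this gives two out-of-sample tests: 2017/2018 and 2018/2019. A strategy that only looks good on the seasons it was tuned on is a warning sign.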

2. Data Quality

Garbage in, garbage out. Inaccurate or incomplete datasets can lead to misleading outcomes. Cleanse and validate your data before backtesting.
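A few basic checks catch many problems early. The sketch below assumes each row is a dictionary with the fields shown and uses decimal odds; the field names, outcome labels and `validate_rows` function are hypothetical.

```python
REQUIRED_FIELDS = ("fixture", "team", "market", "odds", "result")
VALID_RESULTS = {"won", "lost", "void"}


def validate_rows(rows):
    """Return a list of (row_index, problem) pairs; an empty list means no issues found."""
    problems, seen = [], set()
    for i, row in enumerate(rows):
        missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
        if missing:
            problems.append((i, "missing: " + ", ".join(missing)))
            continue  # remaining checks need every field present
        if row["odds"] <= 1.0:
            problems.append((i, "decimal odds must be greater than 1"))
        if row["result"] not in VALID_RESULTS:
            problems.append((i, "unknown result"))
        key = (row["fixture"], row["team"], row["market"])
        if key in seen:
            problems.append((i, "duplicate selection"))
        seen.add(key)
    return problems


# Made-up rows: one clean, one with impossible odds, one duplicate, one incomplete.
rows = [
    {"fixture": "F1", "team": "A", "market": "Match Odds", "odds": 2.1, "result": "won"},
    {"fixture": "F2", "team": "B", "market": "Match Odds", "odds": 0.9, "result": "lost"},
    {"fixture": "F1", "team": "A", "market": "Match Odds", "odds": 2.1, "result": "won"},
    {"fixture": "F3", "team": "C", "market": "Match Odds", "odds": None, "result": "void"},
]
problems = validate_rows(rows)
```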

3. Simulating Live Trading

Remember, backtesting is a simulation. Even after accounting for exchange costs, live results will differ from those simulated in a backtest. One cause is “slippage”: the gap between the price you expected and the price at which your bet is actually matched.

Thomas Shelby from Peaky Blinders looking happy about backtesting, then sad about live trading.

Large bets have a tangible market impact, consuming liquidity and moving the price as the bet is placed. Backtests often assume idealised trade execution at the last known price, which can lead to overly optimistic results.
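To see the effect, consider a toy model of matching a back bet against the available liquidity on an exchange. The ladder figures and the `fill_back_bet` function are made up for the example; real matching is more involved.

```python
def fill_back_bet(ladder, stake):
    """Match a back bet against (odds, available_stake) levels, best odds first.

    Returns (matched_stake, average_matched_odds); the average is None if nothing matched.
    """
    remaining, matched, weighted_odds = stake, 0.0, 0.0
    for odds, available in sorted(ladder, reverse=True):  # highest odds first
        take = min(remaining, available)
        matched += take
        weighted_odds += take * odds
        remaining -= take
        if remaining <= 0:
            break
    average = weighted_odds / matched if matched else None
    return matched, average


# Made-up liquidity: only 50 is available at the headline price of 2.10.
ladder = [(2.10, 50.0), (2.08, 100.0), (2.04, 200.0)]

small_matched, small_avg = fill_back_bet(ladder, 20.0)
large_matched, large_avg = fill_back_bet(ladder, 100.0)
```

The small bet is matched entirely at 2.10, while the larger one takes half its stake at 2.08 and averages about 2.09. A backtest that assumes 2.10 for both overstates the larger bet's returns.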

Summary

Backtesting provides valuable insights about your strategy and should be used as one of the tools in your research and decision-making.
