Backtest a Market Making Strategy: An Event-Driven Approach


During our research into the tools available to high-frequency crypto trading firms, one gap stood out: the lack of a transparent, fast, and highly customizable backtest engine. In particular, we were unable to find any ready-to-use open-source package for backtesting a market making strategy, which is the most common strategy class in high-frequency trading. So we built one: https://github.com/crypto-chassis/ccapi/tree/v5.9.2#spot-market-making-beta. It is open source, so you can inspect every detail. It is fast, so you can run it under as many different conditions as you want. It is easy to change, so you can even rewrite it completely if you wish.

In this article, we explain the core technique used in our backtest engine: the event-driven programming model. This technique allows the backtest engine to:

A. Be written in ~150 lines of code.
B. Run at very high speed.
C. Accommodate algorithms of arbitrarily high frequency.

The technique is also highly relevant if your algorithm is a medium- or low-frequency one based on classical technical indicators but you want to backtest on a vast amount of historical data, run an exhaustive grid search over many parameter combinations, or the algorithm itself is very CPU-intensive. In all of these cases, the speed that an event-driven approach brings can be extremely helpful.

In our experience, many automated trading bots use what we call a time-driven programming model. In such a program, the code relies on the real clock time to do its job. Code is time-driven if, for example, it invokes some sort of sleep function that puts the current thread to sleep for a specified amount of time, sets a timer that will execute some code at a specified future moment, or depends explicitly on the current time by invoking some sort of now function. The advantage of a time-driven programming model is that it is easier to understand and reason about, because it is linear and homogeneous in time: decisions are made based on the information available at a given time point. Coding-wise, the overall logic resembles a for-loop that iterates over time points:

for (time = start_time; time <= end_time; time = next_time) {
  // look up the data available at `time`, then decide what to do
  do_something(time);
}
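
To make the shape of this loop concrete, here is a minimal Python sketch of a time-driven backtest. It is only an illustration under stated assumptions: the sample quotes, the one-second step, and the helper names (`latest_quote_as_of`, `run_time_driven`) are hypothetical and are not taken from our engine. The point it shows is that every time point triggers a data lookup, whether or not anything changed since the previous one.

```python
from bisect import bisect_right

# Hypothetical sample data: (timestamp_seconds, mid_price), sorted by time.
quotes = [(0.0, 100.0), (0.4, 100.5), (2.7, 99.8), (2.9, 100.1)]
timestamps = [t for t, _ in quotes]


def latest_quote_as_of(time_point):
    """Data lookup step: most recent quote at or before time_point."""
    index = bisect_right(timestamps, time_point) - 1
    return quotes[index] if index >= 0 else None


def run_time_driven(start, end, step):
    """Visit every time point and look up the data visible at that moment."""
    observed = []
    n_steps = int(round((end - start) / step))
    for i in range(n_steps + 1):
        time_point = start + i * step
        quote = latest_quote_as_of(time_point)  # one lookup per time point
        observed.append((time_point, quote))    # a strategy would decide here
    return observed


observed = run_time_driven(start=0.0, end=3.0, step=1.0)
for time_point, quote in observed:
    print(time_point, quote)
```

In this toy run the lookups at time points 1.0 and 2.0 return the same quote, so the second one does no useful work, while the quote stamped 2.7 is never seen because it is superseded before the next time point.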

The problem is that a time point by itself doesn’t carry any useful information for making a trade decision. Instead, we have to look at the actual data up to that time point, such as the current and past order book states and the latest and recent trades. This is the information we need to decide whether to create new orders or to modify or cancel existing ones. For a backtest engine, this data lookup step translates into a query against some data source, such as an in-memory cache or a relational database, and it has to happen at each and every time point. In fact, this step is one of the major overheads of a general-purpose backtest framework. When it needs to happen billions of times, speed becomes very important; some simple numbers later on show how quickly each second adds up. This is where the event-driven programming model comes to the rescue. Coding-wise, the overall logic resembles a for-loop that iterates over events:

for (event = first_event; event exists; event = next_event) {
  // the event itself carries the data needed for the decision
  do_something(event);
}
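
As a rough illustration of this loop, the Python sketch below feeds a short, already chronological list of events to a handler. Everything in it is an assumption made for the example: the `Event` fields, the `ToyMarketMaker` class, and the simplistic fill rule (a public trade at or through one of our quotes fills one unit) are hypothetical and do not describe the logic of our engine. What it shows is that each event is visited exactly once and brings its own data, so no separate lookup is needed.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float
    kind: str            # "book" = best bid/ask update, "trade" = public trade
    bid: float = 0.0
    ask: float = 0.0
    price: float = 0.0


class ToyMarketMaker:
    """Hypothetical handler: quotes half_spread around the mid price."""

    def __init__(self, half_spread):
        self.half_spread = half_spread
        self.buy_quote = None
        self.sell_quote = None
        self.position = 0
        self.cash = 0.0

    def on_event(self, event):
        if event.kind == "book":
            # Re-quote around the new mid price.
            mid = (event.bid + event.ask) / 2
            self.buy_quote = mid - self.half_spread
            self.sell_quote = mid + self.half_spread
        elif event.kind == "trade" and self.buy_quote is not None:
            # Assumed fill rule: a trade at or through our quote fills 1 unit.
            if event.price <= self.buy_quote:
                self.position += 1
                self.cash -= self.buy_quote
            elif event.price >= self.sell_quote:
                self.position -= 1
                self.cash += self.sell_quote


def run_event_driven(events, handler):
    """Iterate over events that are already in chronological order."""
    for event in events:
        handler.on_event(event)
    return handler


events = [
    Event(0.0, "book", bid=99.0, ask=101.0),
    Event(0.4, "trade", price=99.4),
    Event(2.7, "book", bid=99.5, ask=101.5),
    Event(2.9, "trade", price=101.2),
]
mm = run_event_driven(events, ToyMarketMaker(half_spread=0.5))
print(mm.position, mm.cash)
```

The loop does work proportional to the number of events rather than the number of clock ticks, which is why quiet periods cost nothing and busy periods are never skipped.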

In the context of trading, an event can be a public trade, a private trade, an order book state change, etc. In general, such a data event carries the information relevant to making a trade decision. The advantage of an event-driven programming model is that it completely eliminates the data lookup overhead. What it boils down to is that we need to load the data events, arrange them in chronological order, and iterate over them one by one. Fortunately, trading data are time series and are naturally stored in chronological order, so the expensive “arrange them in chronological order” step disappears and the task reduces to loading the data events and iterating over them. This is exactly what we did in our backtest engine for the market making strategy.

It takes about 1~2 seconds to backtest one day of historical market data for one exchange (e.g. coinbase) and one instrument (e.g. BTC-USD). Professional firms trade on all major exchanges and major instruments. Suppose each exchange-instrument combo requires at least two years of historical market data and a grid search covers 100 combinations of parameter values. At 1~2 seconds per day of data, that is 730 × 100 × (1~2) seconds, or roughly one to two days of compute for just one exchange-instrument combo, even at this speed. Time quickly adds up!

Realizing that the speed of a backtest engine is of vital importance from a high-frequency trading perspective, we designed and coded our backtest engine with an event-driven approach. Furthermore, we unified the engine for live trading mode, paper trading mode, and backtest mode under a single umbrella, all in about 600 lines of code: https://github.com/crypto-chassis/ccapi/blob/v5.9.2/app/include/app/spot_market_making_event_handler.h.
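
The runtime estimate for a two-year, 100-combination grid search can be reproduced with a few lines of arithmetic. The sketch below only uses the figures quoted in this article (1~2 seconds per day of data, two years, 100 parameter combinations); the function name and the 365-day year are assumptions made for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def backtest_wall_time_days(seconds_per_data_day, years, parameter_combos,
                            days_per_year=365):
    """Wall-clock days needed to backtest one exchange-instrument combo."""
    total_seconds = (seconds_per_data_day * years * days_per_year
                     * parameter_combos)
    return total_seconds / SECONDS_PER_DAY


low = backtest_wall_time_days(1, years=2, parameter_combos=100)
high = backtest_wall_time_days(2, years=2, parameter_combos=100)
print(f"{low:.2f} to {high:.2f} days")
```

With these inputs the total is 73,000 to 146,000 seconds, so the upper end comes out a little under two days for a single exchange-instrument combo.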

Feel free to play with our backtest engine by following the short instructions in https://github.com/crypto-chassis/ccapi/blob/v5.9.2/app/src/spot_market_making/config.env.example. We hope that backtesting brings you some good luck! Stay tuned for our upcoming article on optimizing a market making strategy. If you are interested in our work or in collaborating with us, join us on Discord: https://discord.gg/b5EKcp9s8T. 🎉

Disclaimer: This is an educational post rather than investment/financial advice.
