Internal Racing Conditions in Crypto Exchanges

Published in Open Crypto Trading Initiative

4 min read · May 16, 2023

Greetings, ladies and gentlemen! In two of our previous publications, “The Private Information Edge: An Interesting Tale in High Frequency Trading” and “The Cancel-Order Failure: A Bad Headache for Market Makers”, we discussed in detail how market makers can obtain a private information edge and which techniques they can use to mitigate adverse selection. Those issues stem from the fact that high frequency trading (HFT) happens in highly asynchronous environments where information changes from moment to moment. These lightning-fast changes happen on an HFT trader’s trading servers, and they also happen on the exchange’s trading servers. Traders usually have full control over and visibility into their own servers, and can therefore construct an accurate picture of the sequence of events happening there. However, we rarely have any clue about how the exchange’s servers operate: what the exact sequence of events is, how the different components interact with each other, what the latencies between components are, and so on. All of these might affect our trading logic and timing.

To begin our discussion, let us briefly review the general architecture of an exchange server. When our request to create or cancel an order first arrives at the exchange server, the exchange performs preliminary validity checks (e.g. the price is formatted correctly, the signature is valid, etc.). If the request passes and it is a request to create an order, the exchange’s risk engine performs a balance/collateral check against the account (or subaccount) the order is associated with (no such check is needed for a cancel request). If that passes too, the exchange pushes the request into a First-In-First-Out (FIFO) queue. The request then advances through the queue until it reaches the front. There it is pulled out, and the exchange’s matching engine either matches the order against the order book (for a create request) or removes the order from the order book (for a cancel request). Finally, the exchange broadcasts messages to the traders.
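The flow described above can be sketched as a small, single-threaded toy model. This is only an illustration under our own assumptions: the names `Request`, `ToyExchange`, `submit`, and `process_next` are hypothetical, actual order matching is omitted (created orders simply rest on the book), and a real exchange's internals are not known to us.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    kind: str              # "create" or "cancel"
    order_id: int
    quantity: float = 0.0  # only used by create requests
    price: float = 0.0     # only used by create requests


class ToyExchange:
    """Hypothetical, perfectly linear model of the request pipeline."""

    def __init__(self, available_balance: float):
        self.available = available_balance  # balance free for new orders
        self.locked = 0.0                   # balance reserved by open orders
        self.queue = deque()                # FIFO queue feeding the matching engine
        self.book = {}                      # order_id -> resting quantity
        self.events = []                    # messages broadcast to traders

    def submit(self, req: Request) -> bool:
        # Step 1: preliminary validity checks.
        if req.kind not in ("create", "cancel"):
            return False
        if req.kind == "create":
            if req.quantity <= 0 or req.price <= 0:
                return False
            # Step 2: risk engine balance check (create requests only).
            if req.quantity > self.available:
                return False
            self.available -= req.quantity
            self.locked += req.quantity
        # Step 3: push the accepted request into the FIFO queue.
        self.queue.append(req)
        return True

    def process_next(self) -> None:
        # Step 4: the matching engine pulls the request at the front of the queue.
        req = self.queue.popleft()
        if req.kind == "create":
            self.book[req.order_id] = req.quantity  # matching itself is omitted
            self.events.append(("opened", req.order_id))
            return
        qty = self.book.pop(req.order_id, None)
        if qty is None:
            self.events.append(("cancel_rejected", req.order_id))
            return
        # In this linear model the locked balance is released in the same step
        # as the removal from the book, so no race is possible.
        self.locked -= qty
        self.available += qty
        # Step 5: broadcast the result to traders.
        self.events.append(("canceled", req.order_id))
```

In this model the balance release and the cancel confirmation happen in one step, which is exactly the property a real, asynchronous exchange may not guarantee.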

General Overview of Exchange Server Architecture

This is not so complicated if everything involved happens in a perfectly linear fashion. However, when some steps on the exchange server happen asynchronously, race conditions might occur, and they might cause some inconvenience to traders. We have found that a reasonable way of “studying” the exchange server’s potential race conditions is to swiftly create and cancel orders in a well-defined pattern and observe any “abnormal” responses that come back from the exchange.

For example, for a given exchange, suppose that we have 0.001 BTC in our account balance. Programmatically create a sell maker order with a quantity of 0.001 (yes, all in), wait for a few seconds (this way we fully respect the exchange’s API rate limit), cancel the order, wait until the websocket data feed confirms that the order has actually been canceled (not just an acknowledgement of the request; we need a solid confirmation), then immediately send another request to create a sell maker order with the same quantity. Repeat. In short, we test what might go wrong in the HFT scenario where we submit a new order immediately after the previous one is canceled. Such a scenario is quite common in HFT. Ideally our experiment should see no errors and the process should just go on forever.

However, at the time of writing, when we performed this experiment on KuCoin, we found that there was a small but observable chance that, after we received a websocket confirmation telling us that the order had been canceled, the new order placement failed with an insufficient balance error. This suggests that there was a race condition among the different components of the exchange server: after the matching engine removed an open order from the order book, the risk engine was not told quickly enough that the balance locked by that order should be released back to the available balance, so the risk engine rejected our next create request. An easy workaround is to add a small delay before creating a new order after an old order is canceled, but this certainly adds some inconvenience for HFT traders, because it becomes hard to tell whether an insufficient balance error is a real one or a spurious one.
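To make the suspected race and the delay workaround concrete, here is a minimal simulation, assuming a toy exchange in which the cancel confirmation is sent immediately but the risk engine only sees the released balance a few logical "ticks" later. Everything here (`LaggyExchange`, `InsufficientBalance`, `create_with_retry`, the tick-based clock) is hypothetical and does not reflect KuCoin's actual internals or API.

```python
class InsufficientBalance(Exception):
    """Raised when the toy risk engine rejects a new order."""


class LaggyExchange:
    """Toy model: cancels are confirmed at once, but the balance they release
    becomes visible to the risk engine only `release_lag` ticks later."""

    def __init__(self, balance: float, release_lag: int):
        self.available = balance
        self.release_lag = release_lag
        self.now = 0        # logical clock, in ticks
        self.pending = []   # (tick at which the release becomes visible, quantity)
        self.book = {}      # order_id -> quantity
        self.next_id = 1

    def tick(self, n: int = 1) -> None:
        """Advance the clock and apply every balance release that is now due."""
        self.now += n
        due = [qty for t, qty in self.pending if t <= self.now]
        self.pending = [(t, qty) for t, qty in self.pending if t > self.now]
        self.available += sum(due)

    def create(self, qty: float) -> int:
        # Risk engine check against the balance it currently knows about.
        if qty > self.available:
            raise InsufficientBalance(f"need {qty}, have {self.available}")
        self.available -= qty
        order_id = self.next_id
        self.next_id += 1
        self.book[order_id] = qty
        return order_id

    def cancel(self, order_id: int) -> str:
        qty = self.book.pop(order_id)
        # The matching engine has removed the order, but the release is queued.
        self.pending.append((self.now + self.release_lag, qty))
        return "canceled"   # confirmation goes out before the balance is back


def create_with_retry(ex: LaggyExchange, qty: float,
                      max_retries: int = 3, wait_ticks: int = 1):
    """Retry a rejected create a bounded number of times.

    Returns (order_id, retries_used). If the error persists after
    `max_retries`, it is re-raised and can be treated as a real one.
    """
    for attempt in range(max_retries + 1):
        try:
            return ex.create(qty), attempt
        except InsufficientBalance:
            if attempt == max_retries:
                raise
            ex.tick(wait_ticks)  # stands in for a short sleep in live code
```

With `LaggyExchange(0.001, release_lag=2)`, creating an all-in order, canceling it, and calling `create(0.001)` right away raises `InsufficientBalance`, while `create_with_retry` succeeds after two waits. Bounding the retries is one possible way to separate spurious rejections from real ones, at the cost of added latency.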

If you are a curious reader, you might wonder what happens if we swiftly create and cancel orders in other well-defined patterns. Are there any phenomena that could possibly be taken advantage of? We will leave this question to one of our future articles. If you are interested in our work or in collaborating with us, join us on Discord: https://discord.gg/b5EKcp9s8T and find us on GitHub: https://github.com/crypto-chassis/ccapi 🎉. We specialize in market data collection, high-speed trading systems, infrastructure optimization, and proprietary market making.

Disclaimer: This is an educational article rather than investment/financial advice.

Written by Crypto Chassis