High Frequency Trading in Bitcoin Exchanges

Introduction

In this post I analyze the presence and activity of high frequency trading in a Bitcoin exchange. Since these markets are to date almost entirely unregulated, such behaviour takes place with little to no constraint. I show how over 99% of the orders placed are not meant to be filled, but instead to distort the perception of the market. In addition, I try to spot common HFT strategies, such as Quote Spoofing, Layering and Momentum Ignition. Given the anonymous nature of these exchanges, these last results are to some extent subjective.

What is High Frequency Trading?

From Wikipedia [1], High-frequency trading (HFT) is a type of algorithmic trading characterized by high speeds, high turnover rates, and high order-to-trade ratios that leverages high-frequency financial data and electronic trading tools.

Methodology

This analysis has been carried out with order data from the Websocket stream of GDAX, a US-based digital asset exchange [2] owned by Coinbase. It is one of the largest markets (over 42 million USD/day) [3] and it exposes a high-performance socket where all orders are broadcast. In addition, it offers some interesting features for data analysis:

Orders are timestamped (as opposed to Bitfinex, for example)

It has millisecond granularity (again, as opposed to Bitfinex)

It says whether an order has been matched or cancelled. One could argue that orders disappearing far from the bid/ask spread must have been cancelled (and that is true), but for orders inside the spread this information is necessary.

While data has been captured for several days (at the time of this post I'm still capturing data), only data from July 21, 2017 has been used for the following analysis. Mind you, that is still over 2 million data points.

Since the GDAX feed does not explicitly carry the current best bid/ask, a little preprocessing is needed. The best bid is the highest price among currently open BUY orders, while the best ask is the lowest price among currently open SELL orders. Although this calculation is neither complicated nor particularly slow, it's better to explicitly append the current best bid/ask as additional columns. No further preprocessing has been carried out.
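The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the code used for the analysis. It assumes each event is a dict with the hypothetical fields type ("open" when an order enters the book, "done" when it leaves), order_id, side and price; the real feed's schema may differ.

```python
def annotate_best_bid_ask(events):
    """Replay order events and attach the best bid/ask seen after each one.

    The field names ("type", "order_id", "side", "price") are assumptions
    made for this sketch, not a guaranteed description of the feed.
    """
    open_orders = {}  # order_id -> (side, price) for orders resting on the book
    annotated = []
    for ev in events:
        if ev["type"] == "open":
            open_orders[ev["order_id"]] = (ev["side"], ev["price"])
        elif ev["type"] == "done":
            open_orders.pop(ev["order_id"], None)
        bids = [price for side, price in open_orders.values() if side == "buy"]
        asks = [price for side, price in open_orders.values() if side == "sell"]
        annotated.append({
            **ev,
            "best_bid": max(bids) if bids else None,  # highest open BUY price
            "best_ask": min(asks) if asks else None,  # lowest open SELL price
        })
    return annotated


sample = [
    {"type": "open", "order_id": "a", "side": "buy", "price": 2740.00},
    {"type": "open", "order_id": "b", "side": "sell", "price": 2741.99},
    {"type": "open", "order_id": "c", "side": "buy", "price": 2741.00},
    {"type": "done", "order_id": "c", "side": "buy", "price": 2741.00},
]
for row in annotate_best_bid_ask(sample):
    print(row["type"], row["order_id"], row["best_bid"], row["best_ask"])
```

Rescanning every open order on each event keeps the sketch short. For millions of events, a sorted structure per side would avoid the repeated scan.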

Related work

While writing this article, I came across a blog post from Philip Stubbings at Parasec [4], who made a similar analysis in 2014. While the amount of data differs by orders of magnitude, the findings are the same, especially concerning flashing orders. Quoting from his site:

I collected order book event data from the Bitstamp exchange over a 4 month period, between July and October (2014), resulting in a dataset of ~33 million individual events; A minuscule dataset in comparison to the throughput on “conventional” exchanges, see (Nanex: High Frequency Quote Spam) for example.

While the event dataset consists of ~33 million events, these events can be broken down into individual orders and their types. In total, of the identifiable order types, there were 14,619,019 individual “flashed orders” (orders added and later deleted without being hit) representing 93% of all order book activity, 707,113 “resting orders” (orders added and not deleted unless hit) and 455,825 “marketable orders” (orders that crossed the book resulting in 1 or more reported trades).

As we'll soon see in this report, I recorded 2,169,450 events in less than one day. Stubbings' ~33 million events over 4 months work out to roughly 270,000 events per day, so the number of events per unit of time is about 8 times higher than in 2014. Flashed orders are still the majority, representing over 99% of all order book activity.
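The three order types quoted above can be counted with a few lines of code. The following is only a sketch under simplifying assumptions: each event is a dict with the hypothetical fields order_id, type ("open", "match" or "done") and, for "done" events, reason ("canceled" or "filled"). Orders that are partially hit and then cancelled are counted as resting here, which is my own reading of the quoted definitions.

```python
from collections import Counter


def classify_orders(events):
    """Count flashed, resting and marketable orders in an event stream.

    Event fields ("order_id", "type", "reason") are assumptions of this sketch.
    """
    opened, hit, cancelled = set(), set(), set()
    for ev in events:
        oid = ev["order_id"]
        if ev["type"] == "open":
            opened.add(oid)
        elif ev["type"] == "match":
            hit.add(oid)
        elif ev["type"] == "done" and ev.get("reason") == "canceled":
            cancelled.add(oid)

    counts = Counter()
    for oid in opened | hit | cancelled:
        if oid in opened:
            if oid in cancelled and oid not in hit:
                counts["flashed"] += 1     # added, then deleted without being hit
            else:
                counts["resting"] += 1     # added and left on the book unless hit
        elif oid in hit:
            counts["marketable"] += 1      # traded without ever resting on the book
        else:
            counts["other"] += 1           # cannot be identified from these events
    return counts


def flashed_share(counts):
    """Fraction of classified orders that were flashed."""
    total = sum(counts.values())
    return counts["flashed"] / total if total else 0.0


sample = [
    {"order_id": "a", "type": "open"},
    {"order_id": "a", "type": "done", "reason": "canceled"},
    {"order_id": "b", "type": "open"},
    {"order_id": "b", "type": "match"},
    {"order_id": "b", "type": "done", "reason": "filled"},
    {"order_id": "c", "type": "match"},
]
print(classify_orders(sample), flashed_share(classify_orders(sample)))
```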

HFT Strategies

The Bocconi Students Investment Club (BSIC) [5] describes some strategies that HFT traders use to distort the perception of the market. For this post I'll focus on Spoofing, Layering and Momentum Ignition.

Spoofing & Layering

Quoting from BSIC [5]:

Spoofing is a strategy whereby one places limit orders, and removes them before they are executed. By spoofing limit orders, perpetrators hope to distort other trader’s perceptions of market demand and supply. As an example, a large bid limit order could be placed with the intention of being canceled before it is executed. The spoofer would then seek to benefit from prices rising as the result of false optimism others would see in the market structure.

Detection

There is evidence of high frequency spoofing on July 21, 2017 between 09:45:52 and 09:45:56. Let's take a look at the order book. Red points are SELL orders (3 BTC @ $2741.99), vertical grey lines are cancellations, and the blue and green lines are the bid and ask prices, respectively.

One interesting thing is that neither the bid nor the ask price moves.
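A pattern like this one (the same size at the same price, placed and cancelled repeatedly within seconds) can be searched for automatically. The sketch below is one possible heuristic, not the procedure behind the observation above. The event fields (type, order_id, side, price, size, reason, and time as seconds) and both thresholds are assumptions chosen for illustration.

```python
from collections import defaultdict


def find_flashing_levels(events, max_lifetime=1.0, min_repeats=3):
    """Return (side, price, size) keys that were repeatedly placed and
    cancelled within max_lifetime seconds.

    Field names and thresholds are hypothetical; tune them to the data.
    """
    opened_at = {}               # order_id -> (open time, (side, price, size))
    flashes = defaultdict(list)  # (side, price, size) -> lifetimes in seconds
    for ev in events:
        oid = ev["order_id"]
        if ev["type"] == "open":
            key = (ev["side"], ev["price"], ev["size"])
            opened_at[oid] = (ev["time"], key)
        elif ev["type"] == "done":
            entry = opened_at.pop(oid, None)
            if entry is None or ev.get("reason") != "canceled":
                continue  # unknown order, or it was filled rather than cancelled
            t_open, key = entry
            lifetime = ev["time"] - t_open
            if lifetime <= max_lifetime:
                flashes[key].append(lifetime)
    return {k: v for k, v in flashes.items() if len(v) >= min_repeats}


sample = []
for n in range(4):
    oid = f"s{n}"
    sample.append({"type": "open", "order_id": oid, "side": "sell",
                   "price": 2741.99, "size": 3.0, "time": float(n)})
    sample.append({"type": "done", "order_id": oid, "reason": "canceled",
                   "time": n + 0.5})
print(find_flashing_levels(sample))
```

A hit from this heuristic is only a candidate. Whether the orders were really meant to mislead other traders cannot be read from the feed alone.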

Also from [5]:

More controversial has been the act of layering which carries many similarities to outright spoofing, but differs in that orders are placed evenly across prices with the goal of reserving an early execution priority at each given price level. If the person has no trade to execute at that price point the orders are simply removed. Despite being more benign in nature, the act of layering also distorts market demand and supply perception.

There also seems to be evidence of layering. Let's take a closer look at the minute between 09:41:00 and 09:42:00 on July 21, 2017. Orders seem to push the ASK level downwards, eventually decreasing the BID price. Next, BUY orders are placed at this lowered level, to be sold when the BID price recovers.

Momentum ignition

Still quoting [5]:

Momentum ignition is a strategy in which a trader aims to cause a sharp movement in the price of a stock by using a series of trades, which indicate patterns for high frequency traders, with the motive of attracting other algorithm traders to also trade that stock. The instigator of the whole process knows that after the somewhat “artificially created” rapid price movement, the price reverts to normal and thus the trader profits by taking a position early on and eventually trading out before it fizzles out.

To detect momentum ignition, it is important to focus on the following three main characteristics as shown in the chart below:

Stable prices and a spike in volume

A large price movement compared to the intraday volatility

Reversion to the starting price under a lower volume

The following picture from zerohedge and Credit Suisse AES Analysis illustrates this behavior.
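The three characteristics listed above can be turned into a rough screening rule. The sketch below works on per-interval bars (one price and one traded volume per interval), which is an aggregation I am assuming here. The thresholds vol_mult, move_mult and revert_tol and the look-ahead horizon are arbitrary illustrative values, not calibrated ones.

```python
from statistics import mean, pstdev


def looks_like_momentum_ignition(prices, volumes, i, horizon=10,
                                 vol_mult=3.0, move_mult=3.0, revert_tol=0.25):
    """Check whether bar i looks like the start of a momentum ignition event.

    prices and volumes are equally long lists of per-bar values. All
    parameters are hypothetical and only serve this illustration.
    """
    if horizon < 2 or i < 2 or i + 2 >= len(prices):
        return False  # not enough history or look-ahead
    # 1. Volume spike relative to the preceding bars.
    if volumes[i] < vol_mult * mean(volumes[:i]):
        return False
    # 2. Large price move compared to the volatility observed so far.
    changes = [b - a for a, b in zip(prices[:i], prices[1:i + 1])]
    sigma = pstdev(changes)
    window = range(i + 1, min(i + 1 + horizon, len(prices)))
    peak = max(window, key=lambda j: abs(prices[j] - prices[i]))
    move = abs(prices[peak] - prices[i])
    if move == 0 or move < move_mult * sigma:
        return False
    # 3. Reversion towards the starting price on lower volume.
    for j in range(peak + 1, window.stop):
        reverted = abs(prices[j] - prices[i]) <= revert_tol * move
        if reverted and mean(volumes[peak + 1:j + 1]) < volumes[i]:
            return True
    return False


prices = [100, 100.1, 100, 100.1, 100, 100.5, 103, 104, 102, 100.2, 100.1]
volumes = [1, 1, 1, 1, 10, 6, 5, 3, 2, 1, 1]
print(looks_like_momentum_ignition(prices, volumes, i=4))
```

As with the other strategies, a positive result only marks an interval worth inspecting by eye. It says nothing about the intent behind the trades.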

Conclusion

According to an interview by The Atlantic [6] with Michael Kearns of the University of Pennsylvania and Andrew Lo of MIT, this behaviour also happens in traditional trading, and its causes are still a matter of dispute. Relevant extract:

[…] why would a firm engage in this behavior? Lo and Kearns offered a few theories of their own about what could be happening.

“To be honest, we can’t come up with a good reason,” Kearns said. What’s particularly difficult to explain is how diverse and prevalent the patterns are. If algorithmic traders are simply testing new bots out, which isn’t a bad explanation, it doesn’t seem plausible that they’d do it so often. Alternatively, one could imagine the patterns are generated by some set of systemic information processing mistakes, but then it might be difficult to explain the variety of the patterns.

“It’s possible that the observed patterns are not malicious, in error, or for testing, but for information-gathering,” Kearns observed. “One could easily imagine a HFT shop wanting to regularly examine (e.g.) the latency they experienced from the different exchanges under different conditions, including conditions involving high order volume, rapid changes in prices and volumes, etc. And one might want such information not just when getting started, but on a regular basis, since latency and other exchange properties might well be expected to change over time, exhibit seasonality of various kind, etc. The super-HFT groups might even make co-location decisions based on such benchmarks.”

References

[1] https://en.wikipedia.org/wiki/Bi...

[2] https://www.gdax.com/

[3] Source: https://coinmarketcap.com

[4] http://parasec.net/blog/order-bo...

[5] http://www.bsic.it/marketmanipul...

[6] https://www.theatlantic.com/tech...

[7] https://docs.gdax.com/