Fake volumes in cryptocurrency markets — February report

Mechanics of wash trading behind the scenes

The share of detected artificial (fake) trades and trading volume

88% of crypto trading volume in February 2019 is allegedly inflated. A new report, published by the data science team of Crypto Integrity, has identified widespread malicious practices across top exchanges: OKEx, Bit-Z, ZB.COM, CoinBene, LBank, Huobi Global, BW, HitBTC, IDAX, BitMart, CoinTiger. The analysis of order books and trades of selected liquid symbols has shown that, on average per symbol, 86% of trades and 88% of the volume have been classified as artificial (HitBTC and Huobi excluded). On some symbols the detected fake volume has reached 100% (the details and the methodology are below).

The article covers the following sections:

  1. Market overview
  2. Impact of trading volume on price
  3. Market manipulation practices
  4. Market data analysis by exchanges with charts
  5. Liquidity analysis
  6. Applied methodology
  7. Data samples
  8. Known issues

1. The market

Top crypto exchanges by 30d trading volume (February 2019) — CoinMarketCap

CoinMarketCap (CMC) listed 241 exchanges, although the actual number was higher. Of these 241, 27 exchanges had zero adjusted volume and $93.5 billion of reported volume (21% of total reported exchange-traded volume). These exchanges are mostly futures exchanges or those with zero fees (BitMEX, Bithumb and Cobinhood are among them).

Total 30-day volume (%): $441,557,469,457 (100%)
Excluded (%): $93,511,692,648 (21%)
Adjusted (%): $348,045,776,809 (79%)
Researched (%): $118,802,880,621 (27% of total, 34% of adjusted)
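
The percentages above follow directly from the CoinMarketCap figures. A quick check in Python (the variable names and the `share` helper are ours, introduced only for illustration):

```python
# 30-day volumes in USD, copied from the figures above.
total = 441_557_469_457
excluded = 93_511_692_648
adjusted = 348_045_776_809
researched = 118_802_880_621


def share(part, whole):
    """Return part / whole as a percentage, rounded to the nearest integer."""
    return round(100 * part / whole)


# Excluded and adjusted volume together make up the total reported volume.
assert excluded + adjusted == total

print(share(excluded, total))       # share of total that is excluded
print(share(adjusted, total))       # share of total that is adjusted
print(share(researched, total))     # researched as a share of total
print(share(researched, adjusted))  # researched as a share of adjusted
```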

This research covers 11 exchanges from the list of leading exchanges ranked by adjusted 30-day volume (as of 28-Feb-2019) by CoinMarketCap. These exchanges accounted for 34% of the total adjusted volume in February 2019; our analysis focuses on several of the most liquid symbols of each exchange. It is noteworthy that none of these exchanges supports a fiat currency. Our sample is not meant to be representative, as we intentionally focused on the exchanges with suspicious trading activity, as indicated by BTI, CER and other researchers. However, it gives a good insight into the scale of wash trading in cryptomarkets.

Trading volume in February 2019 of all exchanges with non-zero adjusted volumes — CoinMarketCap
Trading volume in February 2019 of top-50 exchanges with non-zero adjusted volumes — CoinMarketCap. Estimated true trading volume — Crypto Integrity; HitBTC and Huobi Global are not evaluated.

Please note that the OTC market is not covered in this research.

2. Why should I care?

The true trading volume is an indicator of the intrinsic price of an asset. A change in trading volume may also be a leading indicator of an asset’s future price. Theoretically, the market capitalization and price of a cryptocurrency (similar to those of stocks or commodities) depend on the balance of demand and supply. The demand, in turn, depends on the expected investment value (how many people will buy and hold it) and utility value (how many people will use it). In short, it is about the interest in the technology or utility of a coin and the expected adoption among retail as well as institutional investors. The higher the interest is, the higher the trading volumes are. Further, the trading volume depends, among other factors, on the true liquidity present in the order books; this is especially important for institutional investors and traders. Mathematically, it may be formulated as follows:

Price = f(V, p1, …, pN), where V is volume and p1, …, pN are other parameters;
V = g(L, k1, …, kN), where L is liquidity and k1, …, kN are other parameters.

All in all, to make a balanced investment decision one needs a true understanding of the liquidity and trading volume of an asset. While the traditional financial markets have a number of tools to prevent distortion of market data (MD), the cryptocurrency markets are still vulnerable to malicious practices that mislead investors and traders.

3. Market manipulation practices

There are a number of malicious trading practices that are prohibited in regulated financial markets. The most common cases are wash trading and disruptive practices such as spoofing, flipping, quote stuffing etc. In this research, we focus on wash trading as the most prevalent form of market abuse in the crypto world. We plan to cover other topics in our future papers.

Wash trade misleads market participants by artificially increasing trading volume, giving the impression that the instrument is more in demand than it actually is — The Audiopedia

We have already written about the beneficiaries of wash trades in our article Wash trade in cryptomarkets — a case of BW exchange. The outstanding question is how fraudulent exchanges implement wash trading in practice. We identify 3 main mechanisms, from the easiest to implement (and to detect) to the hardest:

  1. in-spread trades without limit orders
  2. in-spread trades with short-lived limit orders
  3. trades near bid-ask caused by short-lived limit orders

In-spread trades without limit orders. This practice is the most popular among new, unsophisticated exchanges because it is simple and riskless. A fraudulent exchange reports a trade while there are no changes in the order book at all. The practice is riskless because no limit orders appear in the order book even for a fraction of a second, so other participants cannot interact with them. Only the exchange itself can maintain such trading activity.

In-spread trades with short-lived limit orders. No one is forbidden from placing a limit order at the mid-market price. Such orders make the spread narrower and add liquidity. In contrast, this malicious practice implies only orders that are present in the order book for just milliseconds so that a human being is unable not only to hit them but also to see them. Strictly speaking, not all short-lived orders are malicious. There are algotraders and arbitrageurs who monitor all incoming orders and quickly hit those that provide them with a trading opportunity. However, the dominance of trades caused by such orders cannot be justified.

Trades near bid-ask caused by short-lived limit orders. In order to make the trading flow more organic, advanced wash trading bots have been tuned in a way to produce SELL trades near the best bid and BUY trades near the best ask. Remember that wash traders have no intention to actually trade with other market participants. In order to minimize this risk, they place limit orders for fractions of a second and then hit them with aggressive ones. Of course, a limit order of 0.5 BTC hit by a market order of 0.5 BTC after 50 milliseconds can be easily classified as a wash trade. Thus, the scammers try to add noise:

  • place limit orders of different sizes
  • place aggressive orders of different sizes (both more and less than the size of a resting order)
  • place limit orders with different life duration
  • place aggressive orders with different life duration (applicable in cases where the size of an aggressive order is bigger than the size of a resting order so that the aggressive order is partially filled and becomes a new resting order) etc.
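
Because of this noise, a single exact-match rule (same size, fixed delay) is easy to evade. A coarser signal is the share of filled limit orders that rested in the book only briefly. The sketch below is a minimal illustration and not the published algorithm: it assumes a full order log is available, and the `RestingOrder` record, its field names and the 100 ms threshold are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class RestingOrder:
    order_id: str
    placed_ms: int   # time the limit order appeared in the book
    removed_ms: int  # time it was filled or cancelled
    filled: bool     # True if the order was removed by a trade


def short_lived_fill_share(orders, max_life_ms=100):
    """Share of filled orders that rested for at most max_life_ms.

    The threshold is an arbitrary illustration value, not a calibrated one.
    """
    filled = [o for o in orders if o.filled]
    if not filled:
        return 0.0
    short = [o for o in filled if o.removed_ms - o.placed_ms <= max_life_ms]
    return len(short) / len(filled)
```

A high share is only a hint: as noted above, algotraders and arbitrageurs legitimately hit fresh orders within milliseconds, so the figure needs to be read against a trusted benchmark.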

4. Exchanges

The following charts show some examples of the suspicious trading activity on the selected exchanges with a short comment. The identified mechanisms of producing suspicious trading activity are summed up in a table below. There are certain issues with the market data from Huobi and HitBTC (read more at the end of the article), which have forced us to exclude these two exchanges.

Identified mechanisms of suspicious trading activity — Crypto Integrity

OKEx

OKEx. The algorithm has detected in-spread trades in some symbols; a more thorough analysis of short-lived orders is required.

Bit-Z

Bit-Z. Systematic trades in-spread without any change in bid-ask spreads. Unstable bid and ask prices signal low liquidity.

ZB.COM

ZB.COM. Trades in-spread, as well as trades out-of-spread, have been detected along with decent ones.

CoinBene

CoinBene. Wide bid-ask spread with short-lived limit orders being systematically placed in-spread.

LBank

LBank. In-spread trades dominate at random prices within the spread; SELL trades are at higher prices than BUY trades.

Huobi Global

Huobi Global. The data is inconclusive due to the market data snapshots every 1–3 seconds, which do not allow restoring true market history.

BW

BW. Systematic buy and sell in-spread trades at the mid-market price.

HitBTC

HitBTC. Even in symbols with a very narrow bid-ask spread, in-spread trades are detected.

IDAX

IDAX. Wide bid-ask spread with short-lived limit orders being systematically placed in-spread; LTC/USDT shows another pattern without limit orders.

BitMart

BitMart. Similar to IDAX, there are two detected patterns: in-spread trades with (e.g., ETH/USDT) and without limit orders (e.g. BTC/USDT).

CoinTiger

CoinTiger. Buy&Sell in-spread trades around the mid-market price.

5. Liquidity

The system calculates 3 liquidity metrics.

Handy liquidity is the cumulative volume in an order book at levels remote from the naive mid-market price by 0.5% or less. More liquid assets have higher handy liquidity.
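
A minimal sketch of this metric, assuming the naive mid-market price is the simple average of the best bid and best ask and that each side of the book is a list of (price, size) pairs with the best level first. The function name and the data layout are our assumptions.

```python
def handy_liquidity(bids, asks, band=0.005):
    """Sum the size of all levels within `band` (0.5% by default) of the mid.

    bids, asks: lists of (price, size) pairs, best level first.
    """
    mid = (bids[0][0] + asks[0][0]) / 2
    low, high = mid * (1 - band), mid * (1 + band)
    return sum(size for price, size in bids + asks if low <= price <= high)


# Only the two levels closest to the mid of 100 fall inside the 0.5% band.
bids = [(99.9, 1), (99.0, 2)]
asks = [(100.1, 3), (101.0, 4)]
print(handy_liquidity(bids, asks))
```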

Bid-ask spread, %, is calculated as:

(best ask - best bid) / [(best ask + best bid) / 2] * 100.

More liquid assets have narrower bid-ask spreads.
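
The formula translates directly into code; the function name is ours:

```python
def bid_ask_spread_pct(best_bid, best_ask):
    """Bid-ask spread as a percentage of the mid-market price."""
    mid = (best_ask + best_bid) / 2
    return (best_ask - best_bid) / mid * 100


# A book quoted 99 / 101 has a mid of 100 and a spread of 2 price units.
print(bid_ask_spread_pct(99.0, 101.0))
```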

Bid-ask spread by 10 BTC, %, is calculated as:

(ask - bid) / [(ask + bid) / 2] * 100,

where ask and bid are weighted average prices for the aggregated volume of 10 BTC. More liquid assets have narrower spreads.
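
A sketch of how the weighted average prices could be obtained by walking the book level by level. The function names and the (price, size) layout are our assumptions; returning `None` when the stored depth is too shallow to fill the target volume is also our choice.

```python
def vwap_for_volume(levels, target):
    """Volume-weighted average price for filling `target` units.

    levels: list of (price, size) pairs, best level first.
    Returns None if the book is too shallow to fill the target.
    """
    remaining = target
    cost = 0.0
    for price, size in levels:
        take = min(size, remaining)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            return cost / target
    return None


def spread_by_volume_pct(bids, asks, target=10.0):
    """Bid-ask spread, %, between the weighted prices for `target` units."""
    bid = vwap_for_volume(bids, target)
    ask = vwap_for_volume(asks, target)
    if bid is None or ask is None:
        return None
    return (ask - bid) / ((ask + bid) / 2) * 100
```

For example, with bids `[(99, 5), (97, 5)]` and asks `[(101, 5), (103, 10)]`, the weighted prices for 10 units are 98 and 102, so the spread is 4%.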

The summary of liquidity by exchange (BTC/USDT symbol). Green means high and red means low liquidity. Coinbase serves only as a benchmark.

Surprisingly, BitMart had the highest handy liquidity during the observed period. OKEx had good estimated values of all three metrics. However, further analysis may be required, as displayed liquidity may not be equal to truly accessible liquidity. Some fraudulent exchanges allegedly display a specific type of limit order that cannot be hit by other market participants.

6. Methodology

The system analyses market data collected via the native API of every selected exchange. Market data consists of order books (with updates if applicable) and trades.

Definition of a suspicious trade (wash trade):

  • If a trade happens at a price higher than the best ask or lower than the best bid, such a trade is deemed to be an “out-of-spread trade”.
  • If a trade happens at a price lower than the best ask and higher than the best bid, such a trade is deemed to be an “in-spread trade”.
  • Other trades, which happen at the best bid or ask prices, are deemed to be decent trades.

Total artificial trades are the sum of out-of-spread and in-spread trades. Total artificial volume is the sum of the volume of total artificial trades.
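
The classification can be sketched as a small function. This is an illustration rather than the published implementation: it reads "in-spread" as strictly between the best bid and the best ask, and it assumes the best bid and ask passed in are the ones in force at the moment of the trade.

```python
def classify_trade(price, best_bid, best_ask):
    """Label a trade by its price relative to the prevailing best bid and ask."""
    if price > best_ask or price < best_bid:
        return "out-of-spread"
    if best_bid < price < best_ask:
        return "in-spread"
    return "decent"  # executed exactly at the best bid or the best ask


print(classify_trade(100.0, 99.0, 101.0))  # strictly inside the spread
print(classify_trade(101.0, 99.0, 101.0))  # at the best ask
print(classify_trade(102.0, 99.0, 101.0))  # beyond the best ask
```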

The overall statistics are calculated as follows:

  1. calculate the share of total artificial volume and of total artificial trades per symbol per exchange;
  2. calculate the arithmetical average (not volume-weighted average!) of the shares of total artificial volume and of total artificial trades across all selected symbols of all supported exchanges.
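
The two steps above can be sketched as follows. The input layout (a list of (label, volume) pairs per symbol) and the function names are our assumptions.

```python
from statistics import mean

ARTIFICIAL = {"in-spread", "out-of-spread"}


def artificial_shares(trades):
    """Step 1: shares for one symbol on one exchange.

    trades: non-empty list of (label, volume) pairs.
    Returns (share of artificial trades, share of artificial volume).
    """
    n_art = sum(1 for label, _ in trades if label in ARTIFICIAL)
    v_art = sum(vol for label, vol in trades if label in ARTIFICIAL)
    v_all = sum(vol for _, vol in trades)
    return n_art / len(trades), v_art / v_all


def overall_average(shares_by_symbol):
    """Step 2: plain arithmetic mean across symbols, not volume-weighted."""
    trade_shares, volume_shares = zip(*shares_by_symbol.values())
    return mean(trade_shares), mean(volume_shares)
```

Because the mean is not volume-weighted, a thinly traded symbol moves the overall figure as much as a heavily traded one.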

Due to potential issues with data collection and time synchronisation, our algorithm may detect artificial trades when there is no wash trading (a Type I error, i.e. a false positive). We have considered Coinbase Pro as a benchmark for 100% organic flow; however, the algorithm has detected minor anomalies even there.

The detection of in- and out-of-spread trades for the benchmark (Coinbase Pro).

7. Data samples

As mentioned above, we have analyzed several selected symbols, which are marked in the following table. The analysis is made on randomly selected data samples (both in February and at the beginning of March 2019).

Analyzed symbols per exchange

Our research is based on the analysis of the raw data: trades and order books. We save 10 ask and 10 bid levels of an order book. There are three types of market data protocols:

  1. Full order log (usually via FIX, ITCH, WS etc.), the most detailed type, provides the history of every single order — its placement time, its execution or cancellation;
  2. Level 2 updates (usually via FIX, ITCH, WS etc.) provide every single update (snapshot) of an order book up to N price levels; an update is a change in an aggregate size at a particular price level;
  3. Level 2 snapshots (usually via WS, REST) provide a state of an order book at the moment of a snapshot; it may be designed by an exchange (e.g., Bittrex pushes MD updates every 1 sec via WS) or constrained by the protocol and rate limits of an exchange (e.g., Yobit limits the number of requests to get an order book snapshot to 100 per minute).
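
To illustrate the second protocol type, here is a minimal sketch of maintaining one side of a book from Level 2 updates. It assumes each update carries the new aggregate size at a price level and that a size of zero removes the level; this is a common convention, but the exact message format differs between exchanges, and the function names are ours.

```python
def apply_l2_update(side, price, size):
    """Apply one Level 2 update to a book side (dict of price -> aggregate size)."""
    if size == 0:
        side.pop(price, None)  # level removed
    else:
        side[price] = size     # level added or resized


def best_bid_ask(bids, asks):
    """Return (best bid, best ask), or None for an empty side."""
    return (max(bids) if bids else None, min(asks) if asks else None)


bids, asks = {99.0: 2.0}, {101.0: 1.5}
apply_l2_update(bids, 100.0, 0.5)  # a new bid appears inside the spread
print(best_bid_ask(bids, asks))
apply_l2_update(bids, 100.0, 0)    # and is removed again
print(best_bid_ask(bids, asks))
```

With snapshots only (the third type), any such level that appears and disappears between two snapshots is never seen.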

Few exchanges report a timestamp of an order book. Where no timestamp is reported, we have used our local server time for both order books and trades, i.e. the time when either an order book or a trade was received by our server. Here is a summary of MD protocols by exchange:

The summary of market data protocols by exchange

The more detailed data we have at hand, the more thorough and elaborate research we can do. Millisecond timestamps of both order books and trades (as stamped by an exchange) are required in order to precisely match order books and trades. Poor data quality makes it harder to detect market manipulation. Therefore, as a rule of thumb, the trustworthiness of exchanges with a better API is higher.

8. Known issues

Huobi. Although the API documentation claims the subscriber will receive updates upon any change, we have observed that in fact market data updates via WebSockets API are aggregated and pushed with 1-second frequency. It means that (a) the orders with the life duration of less than 1 second, which were placed and filled/cancelled between two updates, are absent, (b) the actual time when an order was placed, filled or cancelled is unknown.

Huobi. Market data updates tend to have a frequency of 1 second.

HitBTC. Although the API documentation claims that each update will be pushed, we have observed that the market data updates via WebSockets API are aggregated and pushed with 100-millisecond frequency (or a multiple of 100 ms). The implications are similar to those of Huobi. In addition, we have faced an issue with missing updates of order cancellation, which may have led to the incorrect construction of the bid-ask spread and, consequently, incorrect statistics.

HitBTC. Market data updates tend to have a frequency of 100 milliseconds.
Coinbase Pro (benchmark). Market data updates tend to have a frequency of 1 millisecond.
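
One simple way to check how often a feed actually pushes updates is to look at the gaps between consecutive update timestamps. The sketch below assumes a list of arrival times in milliseconds; the function name is ours, and using the median rather than the mean is our choice to reduce the effect of occasional delayed messages.

```python
from statistics import median


def median_update_interval_ms(timestamps_ms):
    """Median gap between consecutive updates, or None with fewer than two."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return median(gaps) if gaps else None


# Updates arriving on a 100 ms grid, with one update skipped.
print(median_update_interval_ms([0, 100, 200, 400, 500]))
```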

P.S. We would like to remind you that our fraud-detection algorithm is public and accessible in our GitHub repository. Feel free to check it and report any mistakes you may find. Any contribution or suggestions are highly appreciated! Also, do not forget to follow us on Twitter.