Using Benford’s Law to detect volume manipulation on crypto exchanges

Gautier Humbert
Koinju

--

“Benford’s Law” is an empirical statistical law known for its use in detecting accounting fraud. It describes a distribution that is not intuitive: across a large number of accounting entries, more numbers start with 1 than with 2, more start with 2 than with 3, and so on. The distribution is decreasing and looks like this:

Research Gate — Constantin Rasinariu
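
For reference, the curve above follows a closed form: under Benford’s Law a leading digit (or leading group of digits) d appears with probability log10(1 + 1/d). Here is a minimal Python sketch of that formula. The function name `benford_expected` is our own illustrative choice, not part of any library.

```python
import math

def benford_expected(n_digits=1):
    """Return {leading digits: expected probability} under Benford's Law.

    n_digits=1 covers the digits 1..9, n_digits=2 covers 10..99.
    """
    start, stop = 10 ** (n_digits - 1), 10 ** n_digits
    return {d: math.log10(1 + 1 / d) for d in range(start, stop)}

# Print the expected share of each first digit
for digit, p in benford_expected(1).items():
    print(digit, round(p, 3))
```

The same function with `n_digits=2` gives the 90 expected frequencies used when testing the first two digits.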

It therefore seems a good idea to apply it to crypto trading volumes to detect possible market manipulation. The main advantages of Benford’s Law are that we don’t need to compare an exchange’s results with those of other exchanges, and we don’t need to set an arbitrary threshold to state whether an exchange is cheating.

Global results

We analyzed BTC/USD pairs on 11 exchanges (where no BTC/USD pair was available, we used the BTC/USDT spot pair instead): Binance (USDT), Bitfinex, Bitstamp, Bittrex, Coinbase, Huobi (USDT), Kraken, ItBit, Gemini, FTX and Okex (USDT). We used 15-minute volumes (denominated in BTC) from 1 December 2020 to 31 May 2021 (approximately 17,000 data points per exchange). Here are the global results:

One thing stands out immediately: Binance’s distribution does not conform to Benford’s Law. Volumes beginning with a digit between 1 and 3 are underrepresented and those beginning with a digit between 4 and 9 are overrepresented. We observe the exact opposite on Huobi. Overall, for the first digit, the exchanges that do not fit Benford are the three Asian exchanges, Coinbase and Bittrex.

Testing the first two digits of the volumes gives very similar results. Binance’s distribution behaves the same way: numbers starting with 10 to 30 are underrepresented and those starting with a number above 40 are overrepresented. The exchanges that deviate the most from Benford’s distribution are Huobi and Binance. Coinbase, Bittrex and Okex deviate somewhat less, and analyzing the first two digits reduces their deviations.
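
The first-digit and first-two-digits tests both start by extracting the leading significant digits of each volume and counting how often each value occurs. The sketch below shows one way to do that in Python. The helper names and the sample volumes are hypothetical and only serve as an illustration; they are not the code or the data used for this study.

```python
from collections import Counter

def leading_digits(value, n_digits=1):
    """Return the first n_digits significant digits of a positive number as an int."""
    if value <= 0:
        raise ValueError("value must be positive")
    # Scientific notation puts the significant digits first, whatever the magnitude
    mantissa = f"{value:.15e}".split("e")[0].replace(".", "")
    return int(mantissa[:n_digits])

def digit_frequencies(volumes, n_digits=1):
    """Observed share of each leading digit (or digit pair) among positive volumes."""
    digits = [leading_digits(v, n_digits) for v in volumes if v > 0]
    if not digits:
        raise ValueError("no positive volumes to analyze")
    counts = Counter(digits)
    start, stop = 10 ** (n_digits - 1), 10 ** n_digits
    return {d: counts.get(d, 0) / len(digits) for d in range(start, stop)}

# Hypothetical 15-minute volumes in BTC, for illustration only
sample = [12.5, 0.034, 187.2, 45.0, 9.81, 1.02]
print(digit_frequencies(sample, n_digits=1))
```

Zero volumes are skipped because they have no leading significant digit.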

Case of a distribution that tracks Benford perfectly: ItBit

ItBit is a very good example of a Benford-like digit distribution, as the results below show:

ItBit’s First Digit Test
ItBit’s First Two Digits Test

Indeed, ItBit’s Chi-squared statistics are very low: at a 99% confidence level, they are considered acceptable below 20.09 for one digit and below 122.94 for two digits. The second-order test consists of testing the distribution of the last two digits against Benford’s distribution. Regarding the second-order test for one digit, we assume that humans tend to make transactions in multiples of 5, which would explain why more volumes end in 5, but we don’t know why fewer volumes end in 3 or 6. We find the same pattern on Bittrex and Gemini, and it would be interesting to discover why. So these tests give us no reason to be suspicious of the volumes reported by ItBit (which does not mean that nobody should be).
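
To make the thresholds above concrete, here is a minimal sketch of the Chi-squared statistic against Benford’s expected counts. The function name and the observed counts are hypothetical; the two critical values are simply the ones quoted above, which correspond to 8 and 89 degrees of freedom (9 and 90 digit categories, minus one).

```python
import math

# 99% critical values quoted in the text (first digit, first two digits)
CRITICAL_99 = {1: 20.09, 2: 122.94}

def chi_squared_benford(counts, n_digits=1):
    """Chi-squared statistic of observed leading-digit counts against Benford's Law.

    counts maps each leading digit (or digit pair) to its number of occurrences.
    """
    start, stop = 10 ** (n_digits - 1), 10 ** n_digits
    total = sum(counts.get(d, 0) for d in range(start, stop))
    stat = 0.0
    for d in range(start, stop):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Hypothetical counts for illustration only, not data from this study
observed = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
stat = chi_squared_benford(observed)
print(f"chi2 = {stat:.3f}, below the 99% threshold: {stat < CRITICAL_99[1]}")
```

Because the statistic scales with the number of observations, a deviation of the same relative size produces a larger value on a larger sample.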

Case of a distribution that doesn’t track Benford: Binance

Below are the detailed results of the analysis of Binance’s distribution:

Binance’s First Digit Test
Binance’s First Two Digits Test

Two things are very disturbing about Binance: its distributions deviate the most, yet its second-order tests fit perfectly. The deviations are confirmed by the Chi-squared statistics, which are very high (except for digits 3 and 9). This is suspicious, but we can’t confirm that there is market manipulation because many factors have to be taken into account.

Does it result from using the USDT pair? We can’t reject this hypothesis, because the three Asian exchanges are represented in the analysis by a BTC/USDT pair and all three deviate. Is there perhaps a cultural factor in these markets? That is also possible, but Binance’s deviations are much larger than the others’, and Okex deviates only as much as Coinbase and Bittrex. On the cultural aspect, we know that in China the number 8 brings good luck and 4 bad luck. We don’t observe this effect, possibly because it is hidden by the aggregation of individual trades. There may also be a different general psychology toward the markets. Moreover, Chinese regulation is very different and may have a significant impact: we don’t observe a large deviation on US-regulated exchanges such as ItBit and Gemini (and only a slight one on Coinbase). From a more statistical standpoint, it would be interesting to measure the “chaos” in the datasets to see whether, for example, hype phenomena increase volatility and distort a distribution that would otherwise be close to Benford’s Law. We are currently studying this possibility using the Lyapunov exponent, with the help of Adrien Bonache, lecturer in management sciences at the University of Burgundy.

Statistical tests

It can be difficult to judge whether a distribution conforms to Benford’s Law. We used three different test statistics for this analysis: the Mean Absolute Deviation (MAD), the Chi-squared test and the Kolmogorov-Smirnov test. These tests can give different results depending on how they are calculated. They can be sensitive to the number of data points and to other parameters, which makes them more or less severe. So, what did we observe?

Mean Absolute Deviation
Chi-Squared Test
Kolmogorov-Smirnov Test

Overall, the test statistics agree with one another. The difficulty is that these statistics are used to decide whether a distribution conforms to Benford’s Law. As we have seen, the MAD is useful because its value gives a graded assessment of conformity. For the two other tests, at a 99% confidence level, the Chi-squared test accepts only ItBit for the first digit and Kraken, ItBit, Gemini and FTX for the first two digits, while the Kolmogorov-Smirnov test accepts no exchange for either the first digit or the first two digits. It is very difficult to know whether we have false positives and/or false negatives, so we have to cross-check the tests and exercise good judgment.
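
For readers who want to reproduce the two other statistics, here is a minimal sketch of the MAD and of the Kolmogorov-Smirnov statistic computed on leading-digit counts. It assumes the usual definitions: the MAD is the average absolute gap between observed and expected proportions, and the Kolmogorov-Smirnov statistic is the largest gap between the two cumulative distributions. The function names and sample counts are hypothetical, and no acceptance thresholds are included.

```python
import math

def benford_probs(n_digits=1):
    """Expected Benford probabilities for the leading digit(s)."""
    start, stop = 10 ** (n_digits - 1), 10 ** n_digits
    return {d: math.log10(1 + 1 / d) for d in range(start, stop)}

def mad_and_ks(counts, n_digits=1):
    """Return (MAD, KS statistic) of observed digit counts against Benford's Law."""
    expected = benford_probs(n_digits)
    total = sum(counts.get(d, 0) for d in expected)
    abs_devs = []
    cum_obs = cum_exp = ks = 0.0
    for d, p in expected.items():  # digits are visited in ascending order
        obs = counts.get(d, 0) / total
        abs_devs.append(abs(obs - p))
        cum_obs += obs
        cum_exp += p
        ks = max(ks, abs(cum_obs - cum_exp))
    return sum(abs_devs) / len(abs_devs), ks

# Hypothetical counts for illustration only, not data from this study
observed = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
mad, ks = mad_and_ks(observed)
print(f"MAD = {mad:.5f}, KS = {ks:.5f}")
```

Unlike the Chi-squared statistic, the MAD does not grow with the sample size, which is one reason the tests can disagree on large datasets.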

We can conclude that Binance and Huobi are far from conforming to Benford’s Law, which is suspicious but not conclusive. By contrast, FTX, ItBit, Gemini and Kraken are very close to the theoretical distribution, which gives an impression of clean data. Likewise, this doesn’t mean that they are blameless. Benford’s Law should be used as a first tool that indicates what is suspicious and deserves further investigation, not as a definitive conclusion. We should therefore look more closely at individual trades on the BTC/USDT pairs of Huobi and Binance to gather evidence and find out what is really happening.
