Applied Fraud Detection: Benford’s Law & Stablecoins
Benford’s Law describes how the first (and 2nd, 3rd, etc.) digits of naturally occurring numbers should be distributed. We go a little deeper below, but the key point is that this “law” is often used to detect fraud, and such evidence is sometimes even admissible in court.
Very roughly: if you pick a random number x, count how many numbers between 0 and x start with 1, 2, 3 and so on, and plot the distribution, you expect to see something like:
Intuitively this feels right. Up to any given level the number of leading-2s can be at most the number of leading-1s. But it can also be less. Add up random piles of physical stuff and you’ll find this behavior. For the 3rd or 4th digit of a very large number we expect the distribution to be ~flat.
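The shape described above has a standard closed form: under Benford’s Law the leading digit d shows up with probability log10(1 + 1/d). The sketch below only tabulates that formula. The function name `benford_first_digit` is ours, chosen for illustration.

```python
import math


def benford_first_digit():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    # P(d) = log10(1 + 1/d); the nine terms telescope to log10(10) = 1.
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


expected = benford_first_digit()
for digit, prob in expected.items():
    print(f"{digit}: {prob:.3f}")
```

Leading 1s come out at roughly 30% and leading 9s at under 5%, which matches the intuition that each digit can at best keep up with the one before it.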
Stablecoins violate this law in weird ways. We are not alleging fraud here. The other, more reasonable, explanation is that stablecoins do not behave in this sort of “natural” or physical way because that is not how they are used. Bank account balances obey Benford’s Law. So do shop receipts. Think about your last restaurant bill: it probably was not round. Dinner for 4 off a menu almost never adds up to exactly a round number. But lots of people purchase 100 shares of stock, invest 10k, or do other purely financial things in round amounts. It is also plausible that someone might transfer 9,999 rather than 10k of some currency. Yes, this is a thing.
We are not going to try to explain what is happening right now. We are just going to show that something weird and “unnatural” is happening with stablecoins. This lends additional credence to other statistical patterns we find because we already know this is not physics-like random noise.
Benford’s Law In Practice
This article discusses Benford’s Law and which financial businesses needed bailouts in 2008. The idea is that companies that misrepresented their financial position were more likely to need help. It is full of graphics like:
Here is another study that looks at identifying shell companies using these statistics. And one concerning Indian stocks. These folks even look at Bitcoin manipulation. Here is a study that identifies bribery with this technique. And this article discusses broader applications with the following excellent graphic:
Websites, taxes and population data sets all obey the law. Accurate financials obey the law.
Also note that these charts are for the first digit. The distribution for the 3rd digit and onwards should be about flat at 10%. Again taken from the Wikipedia article:
Several of the linked studies also look at later digits.
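The “about flat at 10%” claim for later digits can be checked directly. The general Benford formula for the n-th significant digit sums the first-digit-style term over every possible prefix of n - 1 digits. This is a minimal sketch of that formula; `benford_nth_digit` is a hypothetical helper name, not something from the linked studies.

```python
import math


def benford_nth_digit(n):
    """Expected share of each value of the n-th significant digit (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        # The leading digit can never be 0.
        return {d: math.log10(1 + 1 / d) for d in range(1, 10)}
    # Sum over every prefix k made of n-1 significant digits.
    lo, hi = 10 ** (n - 2), 10 ** (n - 1)
    return {
        d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(lo, hi))
        for d in range(10)
    }


third = benford_nth_digit(3)
for digit, prob in third.items():
    print(f"{digit}: {prob:.4f}")
```

For n = 3 the shares run from about 10.2% for 0 down to about 9.8% for 9, so a flat 10% line is a fair reference for the third-digit charts.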
Individual transactions might not obey Benford’s Law if, as in the 100-shares-of-stock example, there are frictions that drive people to manually pick larger, rounder amounts.
So rather than looking at individual payment amounts let’s look at the gross flows through wallets. These numbers look and feel like the population, tax and website counts plotted above.
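To make the aggregation step concrete, here is a rough sketch. It rests on several assumptions that are ours, not the article’s: transfers arrive as (sender, receiver, amount) tuples, amounts are integers in the token’s smallest unit, and “gross flow” means everything a wallet sent plus everything it received. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def gross_flows(transfers):
    """Sum inflows plus outflows per wallet (assumed definition of gross flow)."""
    flows = defaultdict(int)
    for sender, receiver, amount in transfers:
        flows[sender] += amount
        flows[receiver] += amount
    return dict(flows)


def nth_digit(value, n):
    """Return the n-th significant digit of an integer, or None if it has fewer digits."""
    digits = str(abs(int(value)))
    if digits == "0" or len(digits) < n:
        return None
    return int(digits[n - 1])


def digit_shares(values, n):
    """Observed share of each n-th digit across the values that have one."""
    counts = Counter(d for d in (nth_digit(v, n) for v in values) if d is not None)
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())} if total else {}


# Toy transfers, purely illustrative and not real on-chain data.
toy = [("a", "b", 1250), ("b", "c", 310), ("c", "a", 9999)]
flows = gross_flows(toy)
print(flows)
print(digit_shares(flows.values(), 3))
```

Working in integer base units sidesteps floating-point surprises when slicing digits out of a number.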
What do we find? Weirdness! This is the 3rd digit which should be near-ruler-flat:
All of these stables are weird. For first digits things are a bit more complex:
The expected distribution is in light blue. Some of these fit and some do not. USDP, for example, fits the 1st digit well but is way off for the 3rd. USDT is arguably the best-behaved dataset for the 3rd digit but then it is colossally off for the 1st digit.
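“Fits” and “way off” can be put on a scale. The article does not say which measure was used, so the sketch below assumes a plain Pearson chi-square statistic against the expected shares. The counts are made up for illustration, and `chi_square_stat` is a hypothetical helper.

```python
def chi_square_stat(observed_counts, expected_probs):
    """Pearson chi-square statistic of observed digit counts vs expected shares."""
    total = sum(observed_counts.values())
    stat = 0.0
    for digit, prob in expected_probs.items():
        expected = total * prob
        observed = observed_counts.get(digit, 0)
        stat += (observed - expected) ** 2 / expected
    return stat


# Flat 10% reference for a later digit (an approximation of the exact Benford shares).
flat = {d: 0.1 for d in range(10)}

# Synthetic counts: one well-behaved sample, one with excess 0s and too few 9s.
well_behaved = {d: 100 for d in range(10)}
skewed = {**well_behaved, 0: 190, 9: 10}

print(chi_square_stat(well_behaved, flat))
print(chi_square_stat(skewed, flat))
# With 10 digit bins there are 9 degrees of freedom; the 5% critical value is about 16.92.
```

With very large samples even tiny deviations produce a large statistic, so it is worth reading the number alongside the charts rather than on its own.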
ICIJ FinCEN Transactions
So far we have been looking at account aggregates. To give an idea of why, it is worth looking at this sample of ICIJ transaction data.
For first digits things match nearly perfectly:
But for later digits there are a lot of excess zeros:
Again, financial transactions themselves, not aggregates, are known to have excess zeros. But look at the 5s and 9s. There is a tiny bias, but nothing like, say, USDC’s 3rd digit, where 9s exceed 0s and 5s are double the 7s and 8s. Stablecoins are still something different.
These numbers do not look like finance and they do not look like physics. As a lot of finance is modelled as physics this pairing is not shocking!
But maybe we can find new “laws of physics” for stablecoins. It makes no sense to treat these flows like economically active money running through an economy. Deriving a “velocity of money” from these flows will not mean anything like what it does in the traditional world. Even certain weird transactions like creating one stablecoin with another are meaningless from a system-wide funding perspective but might not be entirely noise.
This all needs to be reworked. What about:
- Describe some specific process for flows that obey a new “law.”
- Find on-chain evidence this phenomenon happens a lot.
- Run statistics focused on the likelihood of that being random in a world with these properties.
- Measure the occurrence of this stuff against prices.
Now we have a completely fresh perspective for signal. It’s taken a long time to formalize this and develop the tools. But we are there now.
The hardest part, truthfully, was figuring out WTF was really going on. This all makes more sense once you see that. And then it is far easier to navigate the machine.
A Note On Data
These samples cover many, many accounts and are robust to sampling methodology. There are, relatively speaking, an awful lot of USDT accounts, so that is by far the largest dataset. Whether we do this in real peer-reviewed form is a question for later.