WatermelonBlock Technical Blog #2 — The Subtle Art Of Detecting Cryptocurrency Manipulation — Part 1 (Pump-and-Dump)

By: Karthik Ramesh Kamath, WatermelonBlock Data Scientist

There are two kinds of investors on this planet: the ethical ones and... the sleazy ones.

That’s right, good ol’ yin and yang (shrug).

Unfortunately, the cryptocurrency markets (a.k.a. crypto-markets) have been found to be highly vulnerable to sleazy manipulation schemes, ones that can not only influence investors’ decisions but also rattle the market equilibrium as a whole. Malicious activities occur mainly because crypto-markets, unlike traditional stock markets, are not protected by any regulatory measures that could forbid unethical trading strategies. Basing bid (buy) and ask (sell) decisions on the wrong sentiment signals doesn’t end well for investors and, in the worst cases, even leads to bankruptcy.

A recent investigation conducted by The Wall Street Journal found that:

“Several trading groups manipulated the price of multiple digital tokens to the tune of $825 million (USD) in trading activity over the course of this year, resulting in millions of dollars in losses for those who fell for the scams.”

Without any further ado, let’s jump and see how deep the rabbit holes are!

Price manipulation activities can be classified into three categories. Information-based manipulation spreads false information to distort the fair price of crypto-assets. Action-based manipulation distorts the demand or supply of crypto-assets. Trade-based manipulation, the third scheme, places bogus buy/sell orders to control crypto-asset price movements.

The most dreaded trade-based manipulation models that tend to influence the crypto-markets are:

1. Pump-and-Dump

2. Spoof Trading

In this article, we will briefly discuss how pump-and-dump occurs and how machine learning techniques could be used to detect this anomaly.

A trading event is considered a ‘pump-and-dump’ when hostile entities artificially raise the price of a crypto-asset (the pump) and subsequently sell it to make a chunky profit (the dump). During the pumping phase, the manipulators artificially increase demand by issuing large buy orders at the market price of the currency being traded.

These massive buy orders compel other unsuspecting traders to purchase the assets at a slightly higher price or risk not having their orders filled. This, in turn, drives the market price higher.

The manipulators rarely make a purchase during the pump stage, since they continually cancel their buy orders just before they are filled. In the end, they rapidly deploy massive sell orders and execute the dump. As a result, the other investors, who were not wary of the manipulators’ orders, end up having bought the crypto-assets at a higher price than usual.

Okay, enough nightmares already.

So, how on earth can we possibly detect pump-and-dump schemes?

The effectiveness of detecting pump-and-dump events depends on how much ‘trade information’ we have. ‘Trade information’ refers to the pricing data points that buyers and sellers feed into the crypto-market during transactions.

The trade information can be classified into two levels: Level 1 data and Level 2 data.

Level 1 data comprises the buy/sell orders that have been successfully executed. It has the format of opening price, high price, low price, closing price and volume (OHLCV) within a specific time period. These details are usually accessible by the public, thus easy to obtain for analysis.
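To make the OHLCV format concrete, here is a minimal sketch of how executed trades could be rolled up into candles. The function name `to_ohlcv`, the `(timestamp, price, volume)` tuple layout, the 60-second interval and the sample numbers are all assumptions made for illustration, not something an exchange or the source prescribes.

```python
def to_ohlcv(trades, interval_seconds=60):
    """Aggregate executed trades into OHLCV candles.

    `trades` is a list of (timestamp_seconds, price, volume) tuples,
    assumed to be sorted by timestamp (hypothetical layout).
    Returns a dict mapping each interval's start time to its candle.
    """
    candles = {}
    for ts, price, volume in trades:
        # Start of the interval this trade falls into.
        bucket = ts - (ts % interval_seconds)
        candle = candles.get(bucket)
        if candle is None:
            # First trade of the interval sets every price field.
            candles[bucket] = {
                "open": price,
                "high": price,
                "low": price,
                "close": price,
                "volume": volume,
            }
        else:
            candle["high"] = max(candle["high"], price)
            candle["low"] = min(candle["low"], price)
            candle["close"] = price  # last trade seen so far
            candle["volume"] += volume
    return candles


# Made-up trades, purely for illustration.
trades = [(0, 100.0, 2.0), (20, 101.5, 1.0), (45, 99.0, 3.0), (70, 102.0, 1.5)]
candles = to_ohlcv(trades)
```

Each candle only summarises what was actually executed in its interval, which is why Level 1 data carries no trace of orders that were placed and then withdrawn.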

Level 2 data comprises the OHLCV values plus details such as the depth of market (buy/sell orders that have not yet matched) and the market participant identifier (MPID). It shows every order that has been entered, cancelled, or matched.

The MPID is an important parameter that can be used to determine whether actions originate from the same person. However, this identifier can only be accessed by market authorities and is not made public, because publishing it would expose an investor’s transactional details and let other people see what he/she is doing and unfairly benefit from it.

Order cancellation data gives the price and size of the orders that investors withdraw. Irregularly sized order cancellations within a short period of time can serve as an important indicator for detecting crypto price manipulation.
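As a rough illustration of the cancellation indicator, the sketch below flags time windows in which an unusually large share of the order size is cancelled. The event layout `(timestamp, action, price, size)`, the function name `flag_cancellation_bursts` and both thresholds are hypothetical choices; a real detector would calibrate them against actual Level 2 data.

```python
from collections import defaultdict


def flag_cancellation_bursts(events, window_seconds=60,
                             ratio_threshold=0.8, min_cancelled_size=100.0):
    """Return the start times of windows with suspiciously heavy cancellations.

    `events` is a list of (timestamp_seconds, action, price, size) tuples,
    where action is "entered", "cancelled" or "matched" (assumed layout).
    """
    entered = defaultdict(float)
    cancelled = defaultdict(float)
    for ts, action, _price, size in events:
        bucket = ts - (ts % window_seconds)
        if action == "entered":
            entered[bucket] += size
        elif action == "cancelled":
            cancelled[bucket] += size

    flagged = []
    for bucket in sorted(cancelled):
        cancelled_size = cancelled[bucket]
        entered_size = entered.get(bucket, 0.0)
        # Cancellations with no new orders in the window count as an extreme ratio.
        ratio = cancelled_size / entered_size if entered_size > 0 else float("inf")
        if cancelled_size >= min_cancelled_size and ratio >= ratio_threshold:
            flagged.append(bucket)
    return flagged


# Made-up order events, purely for illustration.
events = [
    (5, "entered", 100.0, 10.0),
    (15, "entered", 100.1, 5.0),
    (30, "cancelled", 100.1, 2.0),
    (65, "entered", 100.5, 500.0),
    (80, "cancelled", 100.5, 480.0),
    (90, "entered", 100.6, 20.0),
]
flagged = flag_cancellation_bursts(events)
```

In this toy run only the second window is flagged: 480 of the 520 units entered there are withdrawn again, while the small cancellation in the first window stays below the size floor.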

Learning algorithms like k-Nearest Neighbors and One-Class SVM, along with probabilistic frameworks like Hidden Markov Models, can be deployed on these datasets to spot manipulations in a jiffy.
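To give a feel for the simplest of these, here is a bare-bones k-Nearest Neighbors classifier written from scratch. The two features (price change and volume relative to average), the labels and every number in the toy dataset are invented for illustration; this is a sketch of the general technique, not the model or feature set used in production.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labelled example.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features per candle: (price change, volume / average volume).
train_features = [
    (0.01, 1.0), (0.00, 0.9), (-0.01, 1.1), (0.02, 1.2),  # ordinary candles
    (0.25, 6.0), (0.30, 8.0), (0.22, 5.5),                 # labelled pumps
]
train_labels = [0, 0, 0, 0, 1, 1, 1]  # 1 = pump-and-dump, 0 = normal

suspicious = knn_predict(train_features, train_labels, (0.28, 7.0))
ordinary = knn_predict(train_features, train_labels, (0.00, 1.0))
```

Note that the features here are left unscaled, so the volume ratio dominates the distance; in practice the features would normally be standardised before any distance-based method is applied.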

Although Level 2 data gives accurate outputs, Level 1 data is used for pump-and-dump predictions, since the latter’s variables expose the intention of the manipulator.

We start by transforming the Level 2 data into Level 1 data and reducing the sparsity of the matrix by removing records with excessive null values. The inherent parameters of the resultant Level 1 dataset are then fed as training features to the machine learning algorithm. Apart from the inherent features, the training data must contain data points from historical pump-and-dump events for the model to learn from.

The probability of price manipulation is created as a separate parameter during feature engineering and chosen as the target variable for the learning model. Once these models have learned from the training data, been validated on testing data and been deployed in production, they gain the incredible superpower to predict the occurrence of a pump-and-dump event in the near future.
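The sparsity-reduction step can be sketched as a simple filter over the records. The helper name `drop_sparse_records`, the dictionary layout and the 40% cut-off are assumptions chosen for this example; the source does not say how many null values count as "excessive".

```python
def drop_sparse_records(records, max_null_fraction=0.4):
    """Keep only records whose share of missing (None) fields is acceptable."""
    kept = []
    for record in records:
        if not record:
            continue  # an empty record carries no information
        null_count = sum(1 for value in record.values() if value is None)
        if null_count / len(record) <= max_null_fraction:
            kept.append(record)
    return kept


# Made-up Level 1 records, purely for illustration.
records = [
    {"open": 100.0, "high": 101.5, "low": 99.0, "close": 99.5, "volume": 6.0},
    {"open": None, "high": None, "low": None, "close": 100.2, "volume": 1.0},
    {"open": 100.2, "high": 102.0, "low": 100.0, "close": 101.8, "volume": None},
]
cleaned = drop_sparse_records(records)
```

Here the second record is dropped because three of its five fields are missing, while the third survives with a single gap that could later be imputed or handled by the model.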

Key Takeaways

It’s quite a daunting task for investors to intuit when such schemes will be used, owing to the stochastic nature of human intentions. But these manipulators do leave a breadcrumb trail (karma, perhaps?), a trail that powerful computing engines can identify in order to alert investors to manipulation events like the one above. It is imperative that investors reconsider their investment strategies by incorporating trustworthy real-time market analysis tools in their crypto-trading arsenal or... risk ending up on the long list of innocent victims who have lost money. In the next article, we shall discuss spoof trading and the mathematics behind these anomalies.

Stay tuned to this technical blog series to find out more on WatermelonBlock’s engineering culture and other exciting news that we have for you!

References:

  1. “Some Traders Are Talking Up Cryptocurrencies, Then Dumping Them, Costing Others Millions”, https://www.wsj.com/graphics/cryptocurrency-schemes-generate-big-coin/
  2. “Cryptocurrency Pumping Predictions: A Novel Approach to Identifying Pump And Dump Schemes”, Cameron Ramos, Noah Golub et al.