Machine Learning for Automated Cryptocurrency Trading

Published in

SCB TechX

9 min read · Jun 26, 2023

Introduction

Data analysis is now widely used across industries such as healthcare, marketing, and investment, where its findings support planning for volatile conditions in global markets. Using historical data, personal experience, or statistical research, I can make predictions about the future, and if those predictions are correct, I can better prepare for the risks I take. Investment data covering cash, government bonds, corporate bonds, mutual funds, stocks, derivatives, and cryptocurrencies is difficult to analyze because each asset type has its own volatility profile, and the analysis becomes harder as volatility rises. Cryptocurrencies are a notable case: they are complex and unstable because the market is always open (24x7) and prices change continuously to reflect market conditions. Disciplined decision-making therefore becomes crucial, since human emotions such as fear and greed affect trading.

One of the most obvious changes in the financial sector over the past decade has been the rise of algorithmic trading, or computerized trading systems. According to investment bank and financial services firm Morgan Stanley, research from 2012 revealed that 84 percent of all trades on U.S. stock markets were carried out by computer algorithms, while just 16 percent were executed by individual investors (Financial Times, 2012). Automated trading systems generate specific trading decisions using sophisticated quantitative models, submit orders automatically, and manage those orders after submission. Market declines serve as a severe stress test of an algorithmic trading strategy's usability and competitiveness. The worldwide Coronavirus pandemic in 2020 threw financial markets into disarray and caused significant losses for many investors. In March 2020, U.S. markets experienced a 38 percent decline, recovering in April 2020 and the following months. Numerous marketplaces and exchanges throughout the world showed a similar pattern.

Algorithms that can more accurately anticipate stock prices offer a substantial financial incentive for professionals with access to stock prices. Aside from the possibility of generating multimillion-dollar returns, such algorithms would have additional advantages. One such advantage is the ability to identify poor investments that are bound to fail, thereby lessening the likelihood of large disruptions and market collapses when they occur. Another advantage of a successful algorithm is that it may be extended to other fields with comparable issues. In a study conducted by Kumar et al. (2018), the behavior of the stock market was analyzed and the authors determined the best-fitting model from several traditional machine learning algorithms, which included Random Forest (RF), Support Vector Machine (SVM), Naive Bayes, K-Nearest Neighbor (KNN), and Softmax for stock market predictions.

Figure 1: Cumulative 3 Year Returns: AI Hedge Funds vs All Hedge Funds (Friedman, 2019)

Figure 2: Long Term Analysis AI vs Quants vs Traditional Hedge Fund Indices (Eurekahedge, 2018)

This study first focuses on predicting changes in cryptocurrency prices using deep learning algorithms, traditional machine learning techniques, and traditional indicators. To improve prediction accuracy, I investigated several deep learning methods, including Simple Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU), as well as Light Gradient Boosting Machine and Moving Average. I also considered several approaches for constructing the dataset used to train the algorithms. I then tailored a trading strategy to make full use of the best prediction models.

Data

1. SOL-USDT

Data was gathered from the Binance API (using the import “from binance.client import Client”). The hourly data covers December 1, 2020, to April 30, 2022, as shown in Figure 3 and Table 1. The following fields were collected for this cryptocurrency:
• Symbol
• Date
• Closing price
• Volume
• Number of trades
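
The collection step could look roughly like the sketch below. It assumes the python-binance package, its `Client.get_historical_klines` method, and the standard Binance kline row layout (open time at index 0, close at 4, volume at 5, number of trades at 8). The helper names `klines_to_records` and `fetch_hourly`, and the choice to keep the candle open time as the date, are illustrative and not taken from the original code.

```python
from datetime import datetime, timezone


def klines_to_records(symbol, klines):
    """Keep only symbol, date, close, volume and trade count from raw kline rows."""
    records = []
    for row in klines:
        records.append({
            "symbol": symbol,
            # Index 0 is the candle open time in milliseconds since the epoch.
            "date": datetime.fromtimestamp(row[0] / 1000, tz=timezone.utc),
            "close": float(row[4]),
            "volume": float(row[5]),
            "num_trades": int(row[8]),
        })
    return records


def fetch_hourly(symbol, start="1 Dec, 2020", end="30 Apr, 2022"):
    """Download hourly candles; needs python-binance and network access."""
    # Imported lazily so that klines_to_records can be used offline.
    from binance.client import Client

    client = Client()  # assumption: public market data, no API keys supplied
    klines = client.get_historical_klines(
        symbol, Client.KLINE_INTERVAL_1HOUR, start, end
    )
    return klines_to_records(symbol, klines)
```

Calling `fetch_hourly("SOLUSDT")` would then return one record per hourly candle; the same call with `"BTCUSDT"` would cover the second dataset.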

Table 1: Descriptive statistics for SOL-USDT

Figure 3: SOL-USDT Hourly Historical Closing Price

2. BTC-USDT

Data was gathered from the Binance API (using the import “from binance.client import Client”). The hourly data covers December 1, 2020, to April 30, 2022, as shown in Figure 4 and Table 2. The following fields were collected for this cryptocurrency:
• Symbol
• Date
• Closing price
• Volume
• Number of trades

Table 2: Descriptive statistics for BTC-USDT

Figure 4: BTC-USDT Hourly Historical Closing Price

Methodology

The proposed process is depicted in Figure 5. After obtaining the dataset from the Binance API, the first stage was data exploration: checking the completeness of the data, examining the data types, and testing whether the data follows a normal distribution. Features were then selected and the data was pre-processed, after which experiments were run on the various models. Each model's predictions were inverse-transformed back to the original scale, and the predicted returns were converted to closing prices before evaluation. The final phase was to back-test a trading strategy using five models: RNN, LSTM, GRU, LightGBM, and MA.

Feature selection

I chose returns over prices for constructing both the BTC dataset and the SOL dataset. Most price series are geometric random walks, which makes price levels a poor target for a regression whose expected error should be zero. Returns, by contrast, are typically distributed randomly around a mean of zero, as shown in Figure 6 and Figure 7.
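
As a minimal sketch of the price-to-return conversion, the function below computes the simple return between consecutive closing prices. Expressing the return as a fraction rather than multiplying by 100 is an assumption, and the function name and sample prices are made up for illustration.

```python
def simple_returns(prices):
    """Return (next - current) / current for each consecutive pair of prices."""
    return [(nxt - cur) / cur for cur, nxt in zip(prices, prices[1:])]


closes = [100.0, 110.0, 99.0]  # made-up closing prices
returns = simple_returns(closes)  # one return fewer than there are prices
```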

Figure 7: SOL-USDT Histogram of Hourly Returns

Problem statement
Model
Y4 = f(R1, R2, R3)
Input
- R1 is the return of SOL from day 1 to day 2.
- R2 is the return of SOL from day 2 to day 3.
- R3 is the return of SOL from day 3 to day 4.

For example, [0.43412612, 0.43116426, 0.43277213] → [0.43006414]
- The return is calculated as the percentage difference between the buying and selling prices.
Output
- Y4 is the return of SOL from day 4 to day 5.
Objective
- I used the returns of SOL itself from day 1 to day N to predict the next return, at day N+1. N is the window size, that is, how far back the model looks. For this study I used N = 4, so the four prices from day 1 to day 4 give the three input returns R1 to R3.
Data Preprocessing
Input preparation
I built sequences of returns from the hourly closing prices using a window size of N = 4.
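
The windowing can be sketched as follows, assuming the layout from the problem statement: three consecutive returns as input and the return that follows as the target. The function name `make_windows` and the parameter `n_inputs` are illustrative.

```python
def make_windows(returns, n_inputs=3):
    """Pair each run of n_inputs consecutive returns with the return that follows."""
    features, targets = [], []
    for i in range(len(returns) - n_inputs):
        features.append(returns[i:i + n_inputs])
        targets.append(returns[i + n_inputs])
    return features, targets


# Each feature row plays the role of [R1, R2, R3]; each target is Y4.
X, y = make_windows([0.01, -0.02, 0.03, 0.00, 0.015])
```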
Train Test Validation Split
The data was split into training, validation, and test sets using a ratio of 80%, 10%, and 10%, respectively, with a random state of 0. The dataset consisted of 12,346 records in total, which were split into training, validation, and test sets of 9,876, 1,236, and 1,234 records, respectively. More details can be found in Figure 8.
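
One way to reproduce these set sizes is sketched below using only the standard library. The shuffling, the use of `random.Random` as the source of the "random state", and the rule that the validation set takes whatever remains after rounding down are all assumptions; the original splitting code is not shown.

```python
import random


def split_indices(n_records, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle record indices and cut them into train, validation and test parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = int(n_records * train_frac)
    n_test = int(n_records * test_frac)
    n_val = n_records - n_train - n_test  # validation absorbs the rounding remainder
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(12346)
```

With 12,346 records this rounding rule gives 9,876, 1,236, and 1,234 indices, matching the counts reported above.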

Data normalization
Because returns vary only within a narrow range, the features were transformed with a min-max scaler to the range 0 to 1. The transformation is given by the following formulas:

$$X_{std} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

$$X_{scaled} = X_{std} \cdot (max - min) + min$$

where X_min and X_max are the smallest and largest observed values of the feature, and (min, max) = (0, 1) is the target range.
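
A minimal sketch of this scaling and of the inverse transform applied later to the predictions, written for a single feature and the target range 0 to 1 (so the scaled value equals the standardized value). The class name `MinMaxScaler01` is hypothetical, and the sketch assumes the feature is not constant, since a constant feature would cause a division by zero.

```python
class MinMaxScaler01:
    """Scale one feature to [0, 1] and map scaled values back again."""

    def fit(self, values):
        # Remember the observed minimum and maximum of the feature.
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo
        return [(v - self.lo) / span for v in values]

    def inverse_transform(self, scaled):
        span = self.hi - self.lo
        return [s * span + self.lo for s in scaled]


scaler = MinMaxScaler01().fit([-0.02, 0.0, 0.03])
scaled = scaler.transform([-0.02, 0.0, 0.03])
restored = scaler.inverse_transform(scaled)
```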

Architecture

Five models were implemented and tuned for optimal performance: Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) as deep learning models, along with Light Gradient Boosting Machine (LightGBM) and Simple Moving Average (SMA). To isolate the effect of changing the type of deep learning model, I used the same architecture for all three deep learning models.

Deep learning models (RNN, LSTM, and GRU)
Each model consists of three recurrent layers with 128, 256, and 256 neurons, respectively, and a single-neuron dense output layer with a linear activation function. Each recurrent layer uses the ReLU activation function, and a dropout rate of 0.25 was applied to each of these layers. The first two layers return full sequences with the same length as the input sequence, while the last recurrent layer returns only its final output. For optimization, I used stochastic gradient descent (SGD) with a learning rate of 0.0001, decay of 1e-5, momentum of 0.9, and Nesterov set to True. I used Mean Squared Error (MSE) and Mean Absolute Error (MAE) as loss functions. The summary is depicted in Table 3.

Light Gradient Boosting Machine (LightGBM)
LightGBM is a fast, high-performance framework for gradient boosting, a technique that builds high-precision models by having each new learner correct the accumulated errors of the previously built ones. I specified the parameters {'boosting_type': 'gbdt', 'objective': 'regression', 'metric': 'l2', 'num_leaves': 10, 'max_depth': 5, 'drop_rate': 0.3, 'reg_sqrt': True, 'boost_from_average': True, 'learning_rate': 0.0001, 'verbose': 0}. For the training process, I set 'num_boost_round' to 1000, 'early_stopping_rounds' to 100, and 'verbose_eval' to 50. The summary is depicted in Table 4.

Simple Moving Average (SMA)
The SMA is simply the average price over a specified number of periods, in this case N = 4.
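
As a rough illustration, the sketch below uses the mean of the previous four prices as the forecast for the next one. Treating the SMA as a one-step-ahead forecast in exactly this way is an assumption about how the baseline was applied, and the function name is illustrative.

```python
def sma_forecast(prices, window=4):
    """Forecast each next price as the mean of the previous `window` prices."""
    return [
        sum(prices[i - window:i]) / window
        for i in range(window, len(prices) + 1)
    ]


# The last value is the forecast for the price that has not been observed yet.
forecasts = sma_forecast([1.0, 2.0, 3.0, 4.0, 5.0])
```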

Results
Evaluation
After the experiments, as shown in Figure 9, the MA model performed best, with the lowest MSE (1.88), RMSE (1.37), and MAE (0.99). The second-best model was the RNN, with an MSE, RMSE, and MAE of 2.75, 1.66, and 1.21, respectively. Judged by the Coefficient of Variation (CV), however, the simple RNN model outperformed the others, with the lowest CV of 6.397%, as shown in Figure 10.
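
For reference, the three error metrics can be computed as in the sketch below. The function name is illustrative, and the CV is left out because its exact definition is not given here. Since RMSE is the square root of MSE, the reported pairs are consistent (for example, the square root of 1.88 is about 1.37).

```python
import math


def regression_errors(actual, predicted):
    """Compute MSE, RMSE and MAE between actual and predicted prices."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


scores = regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```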

Figure 10: Coefficient of variation (CV)

Back-testing Trading Strategy
To build an automated trading system, I designed a trading strategy that takes the model predictions as input and outputs actual buy/sell orders. I constructed two portfolios for back-testing, each with a budget of 1,000 USDT. The first portfolio traded a single coin, SOL-USDT. The other portfolio traded multiple coins, namely SOL-USDT and BTC-USDT. I used the same trading strategy to generate buy/sell signals for both portfolios. If the model forecasted a price increase at time t+1 and the portfolio held no SOL or BTC, I bought SOL-USDT or BTC-USDT with the entire budget. If the model forecasted a price decrease at time t+1 and the portfolio held SOL or BTC, I sold all of the SOL-USDT or BTC-USDT in the portfolio. If the actual value of the portfolio in the current period fell below the stop-loss threshold, I also sold all of the SOL-USDT or BTC-USDT in the portfolio. The stop-loss threshold was calculated as total cost*(1-Stopper), with the stopper set at 15%. For the multiple-coin portfolio, I made some assumptions to perform the back-test. The total testing period was 1,236 hours. The first part (hours: 1–617) used the models to trade only SOL. At hour 618, I forced the model to sell all SOL coins if any were held in the portfolio. During the latter part (hours: 619–1,236), I used the models to trade only BTC.
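
The rules above can be sketched for a single coin as follows. This is a simplified illustration, not the original back-testing code: it ignores fees and slippage, checks the stop loss before the sell signal, and takes `forecasts[t]` to be the predicted price for step t+1. All of these are assumptions, as are the function and variable names.

```python
def backtest(prices, forecasts, budget=1000.0, stopper=0.15):
    """Simulate the all-in / all-out rules on one coin.

    prices[t] is the actual close at step t; forecasts[t] is the predicted
    close for step t + 1. Fees and slippage are ignored (assumption).
    """
    cash, coins, cost = budget, 0.0, 0.0
    trades = []
    for t, (price, forecast) in enumerate(zip(prices, forecasts)):
        holding = coins > 0
        if holding and coins * price < cost * (1 - stopper):
            # Position value fell below the stop-loss threshold: sell everything.
            cash, coins = coins * price, 0.0
            trades.append((t, "stop_loss", price))
        elif holding and forecast < price:
            # Model expects a lower price next step: sell everything.
            cash, coins = coins * price, 0.0
            trades.append((t, "sell", price))
        elif not holding and forecast > price:
            # Model expects a higher price next step: buy with the whole budget.
            coins, cost, cash = cash / price, cash, 0.0
            trades.append((t, "buy", price))
    final_value = cash + coins * prices[-1]
    return final_value, trades


final_value, trades = backtest([10, 12, 9, 9], [11, 13, 8, 10])
```

For the multiple-coin portfolio, the same function could be run on the SOL half of the test period and then on the BTC half, passing the final value of the first run as the budget of the second.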
For the single-coin portfolio, the RNN model was the only model that generated a profit during a volatile market under this customized trading strategy, as shown in Table 5 and Figure 11. Similarly, for the multiple-coin portfolio, the RNN model generated the highest profit during a volatile market under the same strategy, as shown in Table 6 and Figure 12. I have summarized the results of the two portfolios and the details of the executed transactions.

Written by Mathee Prasertkijaphan