Machine Learning for Winning Algorithmic Trading? It probably doesn’t work.

Yutong Xie · Published in Analytics Vidhya · Nov 24, 2020 · 5 min read

At least not for the LSTM I am using in this article.

Most machine learning algorithms are, at their core, a kind of regression model. When they try to predict the future, they predict a trend and implicitly assume that any deviation from that trend will revert. This is especially true when the loss function belongs to the squared-error family.

To get an intuitive understanding of that, let’s look at an example. Imagine random noise around a signal. Either the gray line or the yellow line will be punished heavily by the loss function, because the squared difference between the data and the prediction will be large. Only the orange line in the middle is considered a better “fit”. That is not necessarily bad for predicting the mean, but does it generate meaningful trading signals?
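To make that concrete, here is a toy illustration (mine, not from the article): for pure noise around a flat signal, predicting the mean everywhere already gets a lower squared error than a prediction that tries to chase the wiggles, for example one that just repeats yesterday’s value.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = np.zeros(1000)                  # flat "true" signal
data = signal + rng.normal(0, 1, 1000)   # signal plus noise

mse_mean   = np.mean((data - data.mean()) ** 2)       # predict the mean everywhere
mse_wiggly = np.mean((data - np.roll(data, 1)) ** 2)  # "track" yesterday's noise

print(mse_mean, mse_wiggly)   # the flat prediction wins (~1 vs ~2)
```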

In this article I am going to use an LSTM model to predict the future price of SPY, the ETF tracking the S&P 500 index. I will also evaluate whether that prediction can generate tradable signals for profit.

Code

In this section I explain the data and code I am using. I used Python and VS Code for the analysis, and I explain everything in detail. Feel free to skip this part if you don’t want to bother with the coding.

GitHub repo here.

Here are the libraries I used:
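(The original gist is not reproduced here; the following is a minimal sketch of the imports the rest of the sketches in this article rely on. yfinance is my assumption for the data source, not something the article states.)

```python
import math

import numpy as np
import pandas as pd
import yfinance as yf                      # assumed data source
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
```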

First, download the daily data and keep the closing price of each day:
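A minimal sketch of this step, assuming yfinance as the data source (any daily price feed works the same way):

```python
# Download SPY daily bars starting in 2000 and keep only the closing price.
df = yf.download("SPY", start="2000-01-01")
data = df[["Close"]]        # DataFrame with a single 'Close' column
dataset = data.values       # numpy array of shape (n_days, 1)
```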

Because I am trying to predict the index, I want the sample to include at least one complete economic cycle. SPY was first introduced in 1993, so it has lived through two (the dot-com bubble and the 2008 financial crisis). My sample starts in 2000, the beginning of the dot-com bubble. If you want to try this yourself, you can start from the very beginning of SPY’s history.

This data has 5,221 rows, meaning 5,221 trading days, and one column, the closing price.

Second, I split the data into a training set and a testing set.

In selecting the training set, I use the first 80% of the entire sample. The line

training_data_len = math.ceil(len(dataset)*0.8)

gives the smallest integer that is not less than 80% of the length of our data (5,221 rows in total): 0.8 × 5,221 = 4,176.8, and the smallest integer not less than 4,176.8 is 4,177, so this expression returns the length of the training data, 4,177 rows. Observations 4,178 through 5,221 will be our testing set.
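A sketch of how that split might look, continuing from the download step above. The min-max scaling is my addition (it is standard in LSTM tutorials), not something stated in the text:

```python
# Scale prices to [0, 1] and take the first 80% of rows as the training set.
training_data_len = math.ceil(len(dataset) * 0.8)

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(dataset)      # shape (n_days, 1)

train_data = scaled_data[:training_data_len, :]
```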

I have a variable named pstep, which represents how many days away we want to predict the price.

This step also involves a loop when constructing the x and y variables in the training set. The rationale is the following: for each closing price on a given date, I want to use the closing prices of the previous 60 days to predict it. If the data is in chronological order and pstep = 0, the first 60 observations should predict the 61st observation, the 2nd through 61st observations should predict the 62nd, and so on, with the index i running from 60 to the end of the training data. Each training sample is therefore a window of 60 closing prices, and the target is a single closing price.
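One reasonable way to implement that loop, continuing from the sketches above (the exact indexing in the author’s repo may differ):

```python
# Build the supervised samples: 60 past closes -> the close `pstep` days
# after the window. With pstep = 0 this reduces to next-day prediction.
pstep = 15      # the value used later in the article
window = 60

x_train, y_train = [], []
for i in range(window, len(train_data) - pstep):
    x_train.append(train_data[i - window:i, 0])   # previous 60 closes
    y_train.append(train_data[i + pstep, 0])      # close pstep days later

x_train, y_train = np.array(x_train), np.array(y_train)
```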

Then we need to reshape the data, because LSTM expects a 3-dimensional input: number of samples by number of time steps by number of features. The number of time steps is 60 and the number of features is 1. The number of samples is simply how many windows we can slide over the training data.
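Continuing the sketch, the reshape is a one-liner:

```python
# LSTM layers expect (samples, time steps, features); here that is
# (number of windows, 60, 1).
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
```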

Then I need to build the model, compile it, and estimate the parameters (fit the model). The exact numbers of neurons I picked are not important; the result is qualitatively the same with 10 or 100 neurons. I am actually overfitting here, and if I overfit, the results will be very sample-specific.
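A sketch of such a network in Keras. The layer sizes and training settings below are my guesses, since the article only says the exact neuron counts don’t matter much:

```python
# Two stacked LSTM layers followed by dense layers down to a single output.
model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)),
    LSTM(50, return_sequences=False),
    Dense(25),
    Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(x_train, y_train, batch_size=32, epochs=5)
```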

Visualization of Results

So I plot the time series of the predicted value and the realized value in this graph. I use pstep = 15, which means I am predicting the price 15 days ahead using the previous 60 days.
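For completeness, here is one way the test-set prediction and plot could be reconstructed from the earlier sketches; the indexing details are my assumption, not the author’s exact code:

```python
# Build the test windows the same way as the training windows, predict,
# and undo the scaling so the plot is in price space.
test_data = scaled_data[training_data_len - window:, :]

x_test = []
for i in range(window, len(test_data) - pstep):
    x_test.append(test_data[i - window:i, 0])
x_test = np.reshape(np.array(x_test), (len(x_test), window, 1))

predictions = scaler.inverse_transform(model.predict(x_test))
actual = dataset[training_data_len + pstep:, 0]

plt.figure(figsize=(12, 6))
plt.plot(data.index[training_data_len + pstep:], actual, label="Actual close")
plt.plot(data.index[training_data_len + pstep:], predictions, label="Predicted close")
plt.legend()
plt.show()
```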

I believe this picture appears in most LSTM articles. If you zoom in on the validation period, you will see that the model overestimates in some periods and underestimates in others. The model therefore predicts the future fine “on average,” which is exactly what the mean squared error loss rewards.

In the next figure I plot the actual price 15 days later and the predicted price 15 days later, together with the current closing price, looking at the first 100 observations.

If both the green line and the orange line are above the blue line, then both the predicted value and the realized future value are above the current price. Accordingly, we can buy the stock at the current price, hold it for 15 days, and earn a profit. If the green line is above the blue line but the orange line is below it, then trading according to the orange line loses money, and vice versa. If you examine the graph, most of the time we would be wrong.
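The same comparison can be scored numerically. The sketch below is my own evaluation, reusing names from the earlier sketches; it computes how often the predicted direction over the next 15 days matches the realized direction:

```python
# For each test sample, "current" is the last close in its 60-day window,
# "actual" is the realized close pstep days later, and "predictions" is the
# model's forecast for that same date.
m = len(predictions)
current = dataset[training_data_len - 1 : training_data_len - 1 + m, 0]

pred_up   = predictions[:, 0] > current   # model says the price will be higher
actual_up = actual > current              # the price actually ended higher

hit_rate = np.mean(pred_up == actual_up)
print(f"Directional hit rate over the test set: {hit_rate:.2%}")
```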

Verdict

Thank you for reading! I hope we all understand model prediction, and what it means to predict returns, a little better.
