How to predict the stock market using Google TensorFlow and an LSTM neural network

Dmytro Sazonov
9 min read · Sep 19, 2022

Photo by David Jones on Unsplash

This is a step-by-step guide that shows how to predict the stock market using Google’s TensorFlow and an LSTM neural network, one of the most popular machine learning approaches to stock market prediction on Wall Street.

This article grew out of many experiments with machine learning tools and Python in my attempts to predict the stock market and find “the holy grail” that could become the basis for a trading bot. The idea came up when I was looking for a working tool that could generate passive income without my active involvement. This article is the result of those attempts. By the end of the story you will find out whether they were successful or not.

Instead of a Foreword

People have been inventing approaches to stock market prediction probably since the market opened in May 1792. Many traders on Wall Street rely on technical indicators and technical analysis strategies. Some of them go further and use fundamental analysis and even behavioral analysis to increase their chances of beating the market. Some approaches work, some don’t. Meanwhile, as AI and machine learning have developed and spread among IT people, a newer approach to stock market prediction has arrived: the LSTM neural network.

Do you remember the Showtime TV series “Billions”, where a team of young market researchers split from the major firm Axe Capital to launch their own investment firm based on algorithms and automated trading technologies? Now we are like those guys.

In this article I don’t want to describe the structure of LSTM (long short-term memory); you can find a detailed description on Wikipedia if you want. I will focus on the practical implementation and on remarks that will help you understand how to use it for our goal: predicting the price for the next 3 days.

Tools and libraries

During our experiment we will be using:

  • Python
  • Google Colaboratory (Colab)
  • Google TensorFlow and Keras
  • pandas
  • numpy
  • sklearn
  • yahoo_fin
  • matplotlib.pyplot

Imports

First of all, let’s start with the imports. We need to import the libraries listed above, which we will use throughout the experiment.

Google Colab doesn’t have ‘yahoo_fin’ on board, so we need to install it first by running !pip install yahoo_fin in a notebook cell.

Settings

For our neural network we will use some settings. Here’s the short explanation of each of them:

  • ‘N_STEPS’ is the number of days in our window, i.e. how far back the neural network looks;
  • ‘LOOKUP_STEPS’ is the array of day offsets we predict for. There are three in our example: 1 is the next day (tomorrow), 2 is the day after tomorrow, and 3 is the day after that. To predict 4 or 5 days ahead, extend the array like this: [1, 2, 3, 4, 5];
  • ‘STOCK’ is the ticker we investigate in our study. In our case it is Google (ticker GOOGL on NASDAQ);
  • We also use ‘date_now’, which is the current day, and ‘date_3_years_back’, which lets us look 1104 days back into the daily price history of this ticker.
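The settings described above could look roughly like this. It is only a sketch: the article does not state the value of ‘N_STEPS’, so 7 is an assumption, and ‘HISTORY_DAYS’ and ‘date_range’ are hypothetical names introduced for illustration.

```python
import datetime as dt

# Window size in days. The value is NOT given in the article; 7 is an assumption.
N_STEPS = 7
# Day offsets to predict: tomorrow, the day after, and the third day.
LOOKUP_STEPS = [1, 2, 3]
# Ticker to study.
STOCK = "GOOGL"
# How many calendar days of history to request (hypothetical constant name).
HISTORY_DAYS = 1104


def date_range(today, history_days=HISTORY_DAYS):
    """Return (start, end) dates as 'YYYY-MM-DD' strings."""
    start = today - dt.timedelta(days=history_days)
    return start.strftime("%Y-%m-%d"), today.strftime("%Y-%m-%d")


date_3_years_back, date_now = date_range(dt.date.today())
print(STOCK, date_3_years_back, date_now)
```

Keeping the date arithmetic in a small function makes it easy to check the range for any given day before requesting data.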

Load data

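To work with the data we need to load it from Yahoo Finance. In our example we load daily bars (interval 1d) for the last 1104 calendar days; since the market is closed on weekends and holidays, this yields fewer than 1104 bars.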

Let’s check the data that was loaded. As we can see below, by default we get the whole list of columns from the web service. The column ‘open’ is the price at market open, and ‘close’ is the price at market close. There are some other columns too, including the ticker itself.

But we do not need to feed all these columns into the machine learning model. We need just one column, ‘close’, which is the closing price on a particular day.

Let’s see what we have in our ‘init_df’ table so far.

There are only 2 columns: ‘close’ and ‘date’. We will use them further on.

While we are here and have genuine data retrieved from the service, let’s plot it.

To me, the chart looks good. We see how the price of GOOGL stock has changed over the last year. The chart is based on the ‘close’ column, i.e. the closing price.

Scale data

Because we use an LSTM neural network, we need to scale the data in the ‘close’ column: the machine learning algorithm works much better with scaled data than with raw data. So we scale the column before going further.

So, what do we see in the table now?

That’s right, we see the price scaled to the range between 0 and 1. This is just a preliminary operation that lets our model work more efficiently. Further on, you will see why we did this.
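What scaling to the range between 0 and 1 does can be sketched in plain Python. This is a minimal illustration of min-max scaling, not the notebook’s code (which presumably relies on sklearn); the function names and the sample prices are hypothetical.

```python
def min_max_scale(values):
    """Map values linearly so the minimum becomes 0 and the maximum becomes 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / span for v in values], lo, hi


def inverse_scale(scaled, lo, hi):
    """Undo min_max_scale to get prices back from scaled values."""
    return [s * (hi - lo) + lo for s in scaled]


closes = [100.0, 110.0, 105.0, 120.0]  # made-up closing prices
scaled, lo, hi = min_max_scale(closes)
restored = inverse_scale(scaled, lo, hi)
print(scaled)
```

The inverse step matters later: the network predicts scaled values, and they have to be mapped back to prices before they can be read or plotted.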

Prepare data for the engine

So, we have the data loaded and scaled for the machine learning model. However, we still need to prepare it for the next step, and here’s why.

Our initial goal is to predict the stock price for the upcoming three days. This means we have to shift the bars by the number of days we want to predict and prepare the data for the model accordingly.

In our case, ‘close’ is the target column: we shift it and save the result in the column ‘future’.

In this script we also need to build the last sequence, which holds the most recent window for the engine. We will predict prices for the future days from that sequence.

The final pieces are the arrays of X’s and Y’s for the LSTM. X holds the input sequence (window) at each step, and Y holds the target price for that step. I highly recommend experimenting with all of these parts in Google Colab to fully understand what is going on.
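The shift, the windows and the last sequence can be sketched on a plain list of values. This is a simplified illustration under my own assumptions, not the notebook’s code: ‘prepare_data’ is a hypothetical name, and the last window here is simply the most recent ‘n_steps’ values.

```python
from collections import deque


def prepare_data(series, n_steps, days_ahead):
    """Build (X, Y, last_window) from a list of scaled closing prices.

    X[i] is a window of n_steps consecutive values; Y[i] is the value
    days_ahead bars after the last element of that window (the 'future').
    """
    if len(series) < n_steps + days_ahead:
        raise ValueError("series too short for this window and horizon")
    X, Y = [], []
    window = deque(maxlen=n_steps)
    # Only bars that still have a known future value can become training rows.
    for i in range(len(series) - days_ahead):
        window.append(series[i])
        if len(window) == n_steps:
            X.append(list(window))
            Y.append(series[i + days_ahead])
    # The most recent window has no known target yet; we predict from it.
    last_window = series[-n_steps:]
    return X, Y, last_window


X, Y, last_window = prepare_data(list(range(10)), n_steps=3, days_ahead=2)
print(X[0], Y[0], last_window)
```

Using consecutive integers as input makes the shift easy to see: the window [0, 1, 2] is paired with the value two bars after its last element.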

Machine learning model

Now it’s time for the machine learning model and its settings. As you’ve probably guessed, we use an LSTM neural network and a sequential model. We have a few LSTM and Dropout layers, as well as a final Dense layer.

Parameter ‘x_train’ is the array of data prepared for training the model;

Parameter ‘y_train’ holds the target values from the ‘close’ column, which give the expected answers during training;

The first LSTM layer has 60 neurons and the second has 120. Between the LSTM layers there are Dropout layers that randomly drop 30% (0.3) of the neurons during training. Finally, we have a Dense layer with 20 neurons, followed by a single neuron that produces the result at the last step of the model.
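To get a feel for the size of this layer stack, here is a small bookkeeping sketch that counts trainable parameters. It is not the model code itself. It assumes a standard LSTM cell (four gates, each with input weights, recurrent weights and a bias), a single input feature (the scaled close), and one Dropout layer after each LSTM layer; these are my assumptions, not details confirmed by the article.

```python
def lstm_params(units, input_dim):
    """Trainable parameters of a standard LSTM layer.

    Four gates, each with input weights, recurrent weights and a bias.
    """
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return units * input_dim + units


N_FEATURES = 1  # assumption: only the scaled close is fed to the network

layers = [
    ("LSTM(60)", lstm_params(60, N_FEATURES)),
    ("Dropout(0.3)", 0),  # dropout has no trainable parameters
    ("LSTM(120)", lstm_params(120, 60)),
    ("Dropout(0.3)", 0),
    ("Dense(20)", dense_params(20, 120)),
    ("Dense(1)", dense_params(1, 20)),
]
total_params = sum(count for _, count in layers)

for name, count in layers:
    print(f"{name:<14}{count:>8}")
print(f"{'total':<14}{total_params:>8}")
```

Under these assumptions most of the parameters sit in the second LSTM layer, which is one reason training takes a while even on a few hundred bars.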

We train our model for 80 epochs (passes over the training data) with a batch size of 8.

We use ‘mean_squared_error’ as the loss function and the ‘adam’ optimizer, a popular choice in stock price prediction tasks.

Finally, we print a summary of the training results and return the model to the main program, which uses it for stock market prediction.

This is definitely not a comprehensive description of what is going on here, but I didn’t aim for one. To fully understand it you will need to spend at least 6 hours on an appropriate course on Pluralsight.

Predict stock prices

Now we have everything we need to start predicting the GOOGL prices for the upcoming 3 days.

In the following script we initialize the array of predicted prices. Then, in a loop, we prepare the data and train the model at each step, i.e. for each day from our array of days [1, 2, 3]. You probably remember the Settings section, where we set LOOKUP_STEPS to [1, 2, 3]: the first, second and third day.

I must warn you that this loop can take a while: I counted at least 10 minutes for all 3 days. It is like a data mining process. But in the end we get the “desired result”.
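The shape of that loop, one separately trained model per day in LOOKUP_STEPS, can be sketched as follows. To keep the example fast and self-contained, the training step is replaced by a hypothetical placeholder (‘train_placeholder’) that just extrapolates the average daily change. It is NOT an LSTM and only stands in for the slow part.

```python
LOOKUP_STEPS = [1, 2, 3]


def train_placeholder(series, days_ahead):
    """Hypothetical stand-in for 'prepare the data and train the LSTM'.

    Returns a 'model' (a plain function) that extrapolates the average
    daily change by days_ahead bars. It is not a neural network.
    """
    avg_change = (series[-1] - series[0]) / (len(series) - 1)
    return lambda last_value: last_value + avg_change * days_ahead


def predict_all(series, lookup_steps):
    """Train one model per horizon and collect one prediction from each."""
    predictions = []
    for step in lookup_steps:
        model = train_placeholder(series, step)  # one model per horizon
        predictions.append(round(model(series[-1]), 2))
    return predictions


closes = [100.0, 101.0, 103.0, 106.0]  # made-up closing prices
print(predict_all(closes, LOOKUP_STEPS))
```

Because a fresh model is trained for every horizon, the total running time grows linearly with the length of LOOKUP_STEPS, which is why three days already take several minutes with the real network.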

OK, we finally got the results from our neural network. What can we do with them? Many things, except the trading itself. I do not recommend trading on this very limited information alone. It is just a prediction from an artificial intelligence tool, no more and no less. To trade successfully you will need much more than just a closing price on the trading day.

However, this simple example might get you thinking about using machine learning and recurrent neural networks in your trading strategies, as is already done on Wall Street.

In fact, from the picture above I can assume that this is a FLAT market. I would rather do nothing in these market conditions.

Accuracy is 98%. How have I calculated this?

What is the accuracy? It reflects the difference between the predicted result and the actual stock price over a long history of prices. Let’s go back to the listing of our model training process and look closely.

As we can see from the listing, the “loss” (the mean squared error on the scaled values) equals 0.0024, which I take as an error of approximately 2% on our scaled values.
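One way to read the reported loss, assuming it is the mean squared error on the scaled prices (the loss function named earlier): take the square root to get the typical error in scaled units. For 0.0024 this gives about 0.049, so the 2% reading above is on the optimistic side. The price range in the sketch is hypothetical and only shows how to convert the error back to price units.

```python
import math

loss_mse = 0.0024  # training loss reported in the article

# The square root of the mean squared error is the typical error size
# in the same units as the data, here the 0..1 scaled price.
rmse_scaled = math.sqrt(loss_mse)
print(f"RMSE on the 0..1 scale: {rmse_scaled:.4f}")

# Hypothetical price range, used only to show the conversion back to prices.
price_min, price_max = 50.0, 150.0
rmse_price = rmse_scaled * (price_max - price_min)
print(f"RMSE in price units: {rmse_price:.2f}")
```

Note also that this is the loss on the training data, so it says how well the model fits history rather than how well it predicts unseen days.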

Result chart

To show the accuracy on the historical data together with the predicted bars, let’s run the model over the whole ‘x_train’ range and build the resulting chart. I added comments to all parts of the following code. I recommend experimenting with this data step by step in Google Colab; see the link at the end of the article.

The next chart shows the predicted prices for the upcoming 3 days (short purple line), the predicted price for the whole period (dashed blue line) and the actual price (solid orange line).

With the ‘red solid line’ I have highlighted the predicted prices for the future days.

This is not the end yet

At the beginning of this article I promised you an answer on whether I succeeded in my attempts to predict the market. Generally speaking, I had “success” in stock price prediction with an accuracy of 98%; you can check it on the charts above. But that is not enough to trade. Even if you know the price for the upcoming 3 days, it doesn’t mean you will be able to use this information for trading on a regular basis.

Knowing the price doesn’t mean you will be able to use this price within the market itself. You need a ‘little’ more for success.

My experiments with automated trading algorithms show that to trade, and more importantly to trade with more profits than losses, you have to take into account many other parameters such as ‘stop loss’, ‘take profit’, ‘trading amount’, etc. So, to trade successfully on the market you need much more than just a target price.

To be continued

In terms of the Showtime TV series “Billions”, we are at the point where we have just moved into the office of “Mason Capital”, intending to invent a universal algorithm to earn money for our investors and for ourselves.

In the next story I am going to touch on another important part of automated trading: the trading algorithm itself. That algorithm will include everything we need to trade on the market.

I hope you have found what you were looking for.

See you then.

Links

The Colab notebook is here: https://colab.research.google.com/drive/1Z5Wf-Syma-vr86q-AIf-3AVDwgzFvNoj

Dmytro Sazonov

Blockchain enthusiast and artificial intelligence researcher. I believe with these tools we will build better tomorrow :)