Can We Train a Neural Network to Read Stock Market Charts?

Mehdi Zare
Artificial Intelligence in Plain English
5 min read · Aug 17, 2020


Photo by Jason Briscoe on Unsplash

Some people claim they can look at a stock price chart, spot patterns like the head and shoulders, and predict which direction the price will move. There is another group of people, mostly academics and high-profile investment analysts, who don't believe any of this.

The funny thing is that there are successful traders and investors in both groups. So, who's right and who's wrong?

I thought we could put this to the test by trying to train a convolutional neural network (CNN) to see whether there's merit to chart reading. If we succeed in finding a model that predicts better than pure chance, that's at least some evidence that technical analysis is useful. Obviously, if we fail to find such a model, that wouldn't prove chart reading doesn't work!

In the process, we'll build a working model that can be expanded pretty easily. We have to make a lot of assumptions to create this model, and you may disagree with some or all of them. As you will see, it's easy to change many of those assumptions without actually touching the code, because the parameters are defined separately. You can even expand the model by simply copy/pasting a layer!

Some Disclosures Before We Get Started

First of all, I call this a study because it's not really a complete project. I worked on it over a few weekends, just to play around with TensorFlow and see whether I could make something useful.

There are potentially many bugs, errors, or even mistakes in the way I coded it. I would be more than happy to learn about them and improve this work, so please don't hesitate to share your thoughts in the comment section. I'll try to respond as soon as I can.

Basic Stuff about Python and Machine Learning

Python is my favorite programming language. There are plenty of packages, frameworks, and ready-to-use code that can be easily extended. The language itself is very robust, with a huge number of active contributors. If you're new to Python, I recommend taking this amazing course on EdX.

TensorFlow is a comprehensive open-source library for machine learning developed by Google. There are plenty of training materials available for machine learning. The best thing I found so far is a book!

The kind of neural network I used is called a convolutional neural network (CNN), the main type of network used for machine vision. You don't really need to learn the math behind convolution, which is complex. All you need to know is that, through this type of network, computers learn to find patterns in a picture. These patterns can be different from the patterns a human sees. Computers only process numbers, and pictures are no different: an image is just a grid of pixel values. When we train a CNN, the computer learns to find patterns in any part of the picture.

CNNs have proven to be pretty good at classifying pictures into different categories. You can check this and this as simple examples of how easily a CNN can be used to classify images. So, I believe we can safely assume that if there's any recurring pattern in stock charts, a CNN is the best candidate to find it without any bias.
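If you haven't built a CNN before, here is a minimal, self-contained sketch of the kind of Keras model this study relies on. The input shape, layer sizes, and number of filters are illustrative assumptions, not the exact architecture from my notebook.

```python
# Minimal CNN image classifier in TensorFlow/Keras.
# Input shape, filter counts, and layer sizes are illustrative assumptions,
# not the exact architecture used in the notebook.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(128, 128, 1), num_classes=3):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),    # learn local patterns
        layers.MaxPooling2D((2, 2)),                     # downsample
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax")  # one score per class
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_cnn()
model.summary()
```

Adding another convolutional block really is as simple as copy/pasting a `Conv2D`/`MaxPooling2D` pair into the list above.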

How Is All of This Organized?

I created a Jupyter Notebook on the Google Colab platform. To get a better sense of it, you just need to take a look and run it. It will work smoothly, but slowly!

The code is all in Python 3.x, uses a handful of standard libraries, and, of course, TensorFlow for the neural network capabilities.

My rationale is pretty simple. We have a limited list of stock symbols that I assumed would behave similarly. The code gets historical stock prices from Yahoo! Finance, randomly chooses periods of 50 trading days, and uses the first 45 close prices to create a chart and the next 5 trading days to label the chart as a negative, zero, or positive return.
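To make the 45-day-chart / 5-day-label idea concrete, here is a rough sketch of that windowing logic. It uses the yfinance package as a stand-in for the Yahoo! Finance download in the notebook, and the ticker, date range, and the ±1% "zero" cut-off are assumptions for illustration only.

```python
# Sketch of the windowing and labeling logic described above.
# yfinance is a stand-in for the notebook's Yahoo! Finance download;
# ticker, dates, and the "zero" threshold are illustrative assumptions.
import random
import yfinance as yf

prices = yf.download("AAPL", start="2010-01-01", end="2020-08-01")["Close"].squeeze()

WINDOW = 50        # total trading days per sample
CHART_DAYS = 45    # first 45 closes go into the chart
THRESHOLD = 0.01   # returns within ±1% are treated as "zero" (assumed cut-off)

def random_sample(prices):
    start = random.randint(0, len(prices) - WINDOW)
    window = prices.iloc[start:start + WINDOW]
    chart_prices = window.iloc[:CHART_DAYS]            # used to draw the chart
    future = window.iloc[CHART_DAYS:]                  # the next 5 trading days
    ret = future.iloc[-1] / chart_prices.iloc[-1] - 1  # 5-day forward return
    if ret > THRESHOLD:
        label = "positive"
    elif ret < -THRESHOLD:
        label = "negative"
    else:
        label = "zero"
    return chart_prices, label
```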

All the hard work of getting stock prices, randomly choosing trading data, preparing charts, and saving them to drive is handled by some classes I developed. It's completely separate from the rest of the code, so if you're new to Python, don't worry about getting lost in the details.

You can easily change the parameters and assumptions I made, as they are kept separate from the rest of the code.

Parameters and Assumptions

I made some assumptions, such as how many trading days should be used to draw a chart or which symbols to include in our study.

There are also plenty of parameters, from the size and resolution of the charts to the number of charts drawn for each stock and the activation function used in each layer of the model.

I tried to provide enough comments to make it clear what assumptions are made and what parameters are being used.

You should be able to freely change those parameters and assumptions. The process is pretty simple: take a copy of my Google Colab Notebook and change it as you like.
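As an illustration of what keeping parameters separate looks like, here is a hypothetical parameter block. The names and values below are my own placeholders, not the exact variables or settings in the notebook.

```python
# Hypothetical parameter block kept apart from the rest of the code;
# names and values are placeholders, not the notebook's actual settings.
SYMBOLS = ["AAPL", "MSFT", "GOOG"]   # which stocks to include
CHART_DAYS = 45                      # close prices drawn on each chart
LABEL_DAYS = 5                       # days used to compute the label
CHARTS_PER_SYMBOL = 1000             # how many random charts per stock
IMAGE_SIZE = (128, 128)              # chart resolution fed to the CNN
ACTIVATION = "relu"                  # activation for the hidden layers
BATCH_SIZE = 32
EPOCHS = 20
```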

Notebook

Google Colab notebooks can't be embedded directly in Medium, at least as far as I know! So I used a GitHub gist to embed it here.

You can also access it directly through this link.

Training Data vs. Validation Data

To make sure we're checking the model on data it hasn't seen before, I used the min_date and max_date parameters of my DataSet class.

For training data, we used trading data from the past 3,600 days up until the last 120 days. For validation data, we only used the trading data from the last 120 days.

Our model is challenged to predict outcomes on trading days it has never seen before, for the same stocks.

If we find a successful model, we'll then test it on the last 120 days of trading data for stocks it has never seen before.
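Here is a rough sketch of how such a date-based split might be computed. The DataSet class itself lives in the notebook, so the cut-off arithmetic below only illustrates the min_date/max_date idea as described above.

```python
# Sketch of the date-based train/validation split described above.
# The exact DataSet interface lives in the notebook; this only shows
# how the min_date/max_date cut-offs could be derived.
from datetime import date, timedelta

today = date.today()
validation_days = 120
history_days = 3600

# Training: from 3,600 days ago up to (but not including) the last 120 days.
train_min_date = today - timedelta(days=history_days)
train_max_date = today - timedelta(days=validation_days)

# Validation: only the last 120 days.
valid_min_date = today - timedelta(days=validation_days)
valid_max_date = today

# e.g. DataSet(symbols, min_date=train_min_date, max_date=train_max_date)
```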

What’s the Final Answer?

If you run the code, you'll see that the network converges pretty fast, reaching a training accuracy of more than 0.99. However, the validation accuracy is about 0.33, which is no better than chance!

My data pipeline is designed to balance the data fed to the network, so the training set is roughly one third charts labeled as negative returns, one third as zero, and one third as positive.
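One simple way to get roughly a third of each class is to downsample every class to the size of the smallest one. The notebook's pipeline may balance the data differently; this is only a sketch of the idea.

```python
# Balance the three classes by downsampling each to the smallest class size.
# This is a sketch of the idea, not the notebook's actual pipeline.
import random
from collections import defaultdict

def balance(samples):
    """samples: list of (chart, label) pairs, with labels
    'negative', 'zero', or 'positive'."""
    by_label = defaultdict(list)
    for chart, label in samples:
        by_label[label].append((chart, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(random.sample(group, smallest))
    random.shuffle(balanced)
    return balanced
```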

The network quickly picks up on patterns in the training data, categorizing it almost perfectly, but it fails to categorize the validation data any better than a coin flip.
