A simple deep learning model for stock price prediction using TensorFlow

Nov 9, 2017 · 13 min read

Importing and preparing the data

Our team exported the scraped stock data from our scraping server as a CSV file. The dataset contains `n = 41266` minutes of data, ranging from April to August 2017, on 500 stocks as well as the total S&P 500 index price. Index and stocks are arranged in wide format.

```python
import pandas as pd
import numpy as np

# Import data
data = pd.read_csv('data_stocks.csv')
# Drop date variable
data = data.drop(['DATE'], 1)
# Dimensions of dataset
n = data.shape[0]
p = data.shape[1]
# Make data a numpy array
data = data.values
```

Preparing training and test data

The dataset was split into training and test data. The training data contained 80% of the total dataset. The data was not shuffled but sequentially sliced. The training data ranges from April to approximately the end of July 2017; the test data runs until the end of August 2017.

```python
# Training and test data
train_start = 0
train_end = int(np.floor(0.8 * n))
test_start = train_end
test_end = n
data_train = data[np.arange(train_start, train_end), :]
data_test = data[np.arange(test_start, test_end), :]
```

Data scaling

Most neural network architectures benefit from scaling the inputs (and sometimes also the outputs). Why? Because the most common activation functions of the network’s neurons, such as tanh or sigmoid, are defined on the `[-1, 1]` or `[0, 1]` interval respectively. Nowadays, rectified linear unit (ReLU) activations are widely used and are unbounded on the axis of possible activation values. However, we will scale both the inputs and targets anyway. Scaling can be easily accomplished in Python using sklearn’s `MinMaxScaler`. Note that the scaler is fitted on the training data only, so that no information from the test set leaks into the scaling.

```python
# Scale data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
data_train = scaler.fit_transform(data_train)
data_test = scaler.transform(data_test)

# Build X and y
X_train = data_train[:, 1:]
y_train = data_train[:, 0]
X_test = data_test[:, 1:]
y_test = data_test[:, 0]
```
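Since the model is trained on scaled values, its predictions also live on the `[0, 1]` scale. To report them in original index points, the min-max transform of the target column can be inverted. A minimal sketch of that round trip, using toy data instead of the real dataset (the variable names `pred_scaled` and `pred_points` are illustrative, not from the post):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy stand-in for the real dataset: column 0 is the target (index level),
# the remaining columns are stock prices.
rng = np.random.default_rng(0)
data_train = rng.uniform(100, 200, size=(80, 4))
data_test = rng.uniform(100, 200, size=(20, 4))

scaler = MinMaxScaler()
data_train_s = scaler.fit_transform(data_train)
data_test_s = scaler.transform(data_test)

# Invert the transform of column 0 manually: x = x_scaled * (max - min) + min
col_min, col_max = scaler.data_min_[0], scaler.data_max_[0]
pred_scaled = data_test_s[:, 0]                 # pretend these are model predictions
pred_points = pred_scaled * (col_max - col_min) + col_min

# The round trip recovers the original test targets
assert np.allclose(pred_points, data_test[:, 0])
```

The same idea applies to the network output later on: keep the fitted scaler around and undo the scaling of column 0 before plotting predictions against the actual index.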

Introduction to TensorFlow

TensorFlow is a great piece of software and currently the leading deep learning and neural network computation framework. It is based on a low-level `C++` backend but is usually controlled via Python (there is also a neat TensorFlow library for R, maintained by RStudio). TensorFlow operates on a graph representation of the underlying computational task. This approach allows the user to specify mathematical operations as elements in a graph of data, variables and operators. Since neural networks are actually graphs of data and mathematical operations, TensorFlow is a perfect fit for neural networks and deep learning. Check out this simple example (borrowed from the deep learning introduction on our blog):

```python
# Import TensorFlow
import tensorflow as tf

# Define a and b as placeholders
a = tf.placeholder(dtype=tf.int8)
b = tf.placeholder(dtype=tf.int8)

# Define the addition
c = tf.add(a, b)

# Initialize the graph
graph = tf.Session()

# Run the graph
graph.run(c, feed_dict={a: 5, b: 4})
```

Placeholders

As mentioned before, it all starts with placeholders. We need two placeholders in order to fit our model: `X` contains the network's inputs (the stock prices of all S&P 500 constituents at time `T = t`) and `Y` contains the network's output (the index value of the S&P 500 at time `T = t + 1`).

```python
# Placeholders
X = tf.placeholder(dtype=tf.float32, shape=[None, n_stocks])
Y = tf.placeholder(dtype=tf.float32, shape=[None])
```

Variables

Besides placeholders, variables are another cornerstone of the TensorFlow universe. While placeholders are used to store input and target data in the graph, variables are used as flexible containers within the graph that are allowed to change during graph execution. Weights and biases are represented as variables in order to adapt during training. Variables need to be initialized prior to model training. We will get into that a little later in more detail.

```python
# Model architecture parameters
n_stocks = 500
n_neurons_1 = 1024
n_neurons_2 = 512
n_neurons_3 = 256
n_neurons_4 = 128
n_target = 1

# Layer 1: Variables for hidden weights and biases
W_hidden_1 = tf.Variable(weight_initializer([n_stocks, n_neurons_1]))
bias_hidden_1 = tf.Variable(bias_initializer([n_neurons_1]))
# Layer 2: Variables for hidden weights and biases
W_hidden_2 = tf.Variable(weight_initializer([n_neurons_1, n_neurons_2]))
bias_hidden_2 = tf.Variable(bias_initializer([n_neurons_2]))
# Layer 3: Variables for hidden weights and biases
W_hidden_3 = tf.Variable(weight_initializer([n_neurons_2, n_neurons_3]))
bias_hidden_3 = tf.Variable(bias_initializer([n_neurons_3]))
# Layer 4: Variables for hidden weights and biases
W_hidden_4 = tf.Variable(weight_initializer([n_neurons_3, n_neurons_4]))
bias_hidden_4 = tf.Variable(bias_initializer([n_neurons_4]))

# Output layer: Variables for output weights and biases
W_out = tf.Variable(weight_initializer([n_neurons_4, n_target]))
bias_out = tf.Variable(bias_initializer([n_target]))
```

Designing the network architecture

After the required weight and bias variables have been defined, the network topology, i.e. the architecture of the network, needs to be specified. Here, placeholders (data) and variables (weights and biases) are combined into a system of sequential matrix multiplications.

```python
# Hidden layers
hidden_1 = tf.nn.relu(tf.add(tf.matmul(X, W_hidden_1), bias_hidden_1))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, W_hidden_2), bias_hidden_2))
hidden_3 = tf.nn.relu(tf.add(tf.matmul(hidden_2, W_hidden_3), bias_hidden_3))
hidden_4 = tf.nn.relu(tf.add(tf.matmul(hidden_3, W_hidden_4), bias_hidden_4))

# Output layer (must be transposed)
out = tf.transpose(tf.add(tf.matmul(hidden_4, W_out), bias_out))
```
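The layer dimensions have to chain: each layer's output width must equal the next layer's weight-matrix height, so a batch of inputs of shape `(batch, 500)` is successively mapped to `(batch, 1024)`, `(batch, 512)`, and so on, down to `(batch, 1)`. A small NumPy sketch of this shape propagation (illustrative only, not the TensorFlow graph itself; the weight contents here are zeros because only the shapes matter):

```python
import numpy as np

batch = 32
dims = [500, 1024, 512, 256, 128, 1]   # n_stocks -> hidden layers -> n_target

x = np.zeros((batch, dims[0]))
for d_in, d_out in zip(dims[:-1], dims[1:]):
    W = np.zeros((d_in, d_out))        # weight matrix of one layer
    b = np.zeros(d_out)                # bias vector of one layer
    x = np.maximum(x @ W + b, 0)       # ReLU(x W + b)

print(x.shape)  # (32, 1): one prediction per observation in the batch
```

This also makes clear why the output in the TensorFlow snippet is transposed: `out` must match the shape of the `Y` placeholder, which is a flat vector of targets.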

Cost function

The cost function of the network is used to generate a measure of deviation between the network’s predictions and the actual observed training targets. For regression problems, the mean squared error (MSE) function is commonly used. MSE computes the average squared deviation between predictions and targets. Basically, any differentiable function can be implemented in order to compute a deviation measure between predictions and targets.

```python
# Cost function
mse = tf.reduce_mean(tf.squared_difference(out, Y))
```
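For intuition, the TensorFlow expression above is simply the mean of the elementwise squared deviations. A quick NumPy equivalent on made-up numbers (the values are illustrative):

```python
import numpy as np

# NumPy equivalent of tf.reduce_mean(tf.squared_difference(out, Y))
pred = np.array([0.50, 0.52, 0.48])
target = np.array([0.51, 0.50, 0.49])
mse = np.mean((pred - target) ** 2)
print(mse)  # ≈ 0.0002
```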

Optimizer

The optimizer takes care of the necessary computations that are used to adapt the network’s weight and bias variables during training. Those computations invoke the calculation of so-called gradients, which indicate the direction in which the weights and biases have to be changed during training in order to minimize the network’s cost function. The development of stable and fast optimizers is a major field in neural network and deep learning research.

```python
# Optimizer
opt = tf.train.AdamOptimizer().minimize(mse)
```

Initializers

Initializers are used to initialize the network’s variables before training. Since neural networks are trained using numerical optimization techniques, the starting point of the optimization problem is one of the key factors in finding good solutions to the underlying problem. There are different initializers available in TensorFlow, each with a different initialization approach. Here, I use the `tf.variance_scaling_initializer()`, which implements one of the default initialization strategies.

```python
# Initializers
sigma = 1
weight_initializer = tf.variance_scaling_initializer(mode="fan_avg", distribution="uniform", scale=sigma)
bias_initializer = tf.zeros_initializer()
```
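With `mode="fan_avg"`, `distribution="uniform"` and `scale=1`, variance scaling amounts to the Glorot/Xavier uniform scheme: weights are drawn from `U[-limit, limit]` with `limit = sqrt(3 * scale / n)` and `n = (fan_in + fan_out) / 2`, so the weight variance is roughly `scale / n`. A NumPy sketch of that formula (the helper `glorot_uniform` is my own illustrative name, not a TensorFlow function):

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, scale=1.0, seed=42):
    # U[-limit, limit] with limit = sqrt(3 * scale / n), n = (fan_in + fan_out) / 2
    n = (fan_in + fan_out) / 2.0
    limit = np.sqrt(3.0 * scale / n)
    rng = np.random.default_rng(seed)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# First layer of the model: 500 inputs, 1024 neurons
W = glorot_uniform(500, 1024)
# Empirical variance should be close to scale / n = 1 / 762
print(W.var(), 1.0 / ((500 + 1024) / 2))
```

Keeping the weight variance tied to the layer's fan-in and fan-out helps the signal magnitude stay roughly constant across layers at the start of training, which is exactly why this family of initializers is a common default.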

Fitting the neural network

After having defined the placeholders, variables, initializers, cost function and optimizer of the network, the model needs to be trained. Usually, this is done by minibatch training. During minibatch training, random data samples of `n = batch_size` are drawn from the training data and fed into the network. The training dataset gets divided into `n / batch_size` batches that are sequentially fed into the network. At this point the placeholders `X` and `Y` come into play. They store the input and target data and present them to the network as inputs and targets.

```python
import matplotlib.pyplot as plt

# Make Session
net = tf.Session()
# Run initializer
net.run(tf.global_variables_initializer())

# Setup interactive plot
plt.ion()
fig = plt.figure()
ax1 = fig.add_subplot(111)
line1, = ax1.plot(y_test)
line2, = ax1.plot(y_test * 0.5)
plt.show()

# Number of epochs and batch size
epochs = 10
batch_size = 256

for e in range(epochs):
    # Shuffle training data
    shuffle_indices = np.random.permutation(np.arange(len(y_train)))
    X_train = X_train[shuffle_indices]
    y_train = y_train[shuffle_indices]
    # Minibatch training
    for i in range(0, len(y_train) // batch_size):
        start = i * batch_size
        batch_x = X_train[start:start + batch_size]
        batch_y = y_train[start:start + batch_size]
        # Run optimizer with batch
        net.run(opt, feed_dict={X: batch_x, Y: batch_y})
        # Show progress
        if np.mod(i, 5) == 0:
            # Prediction
            pred = net.run(out, feed_dict={X: X_test})
            line2.set_ydata(pred)
            plt.title('Epoch ' + str(e) + ', Batch ' + str(i))
            file_name = 'img/epoch_' + str(e) + '_batch_' + str(i) + '.jpg'
            plt.savefig(file_name)
            plt.pause(0.01)

# Print final MSE after training
mse_final = net.run(mse, feed_dict={X: X_test, Y: y_test})
print(mse_final)
```

Conclusion and outlook

The release of TensorFlow was a landmark event in deep learning research. Its flexibility and performance allow researchers to develop all kinds of sophisticated neural network architectures as well as other ML algorithms. However, flexibility comes at the cost of longer time-to-model cycles compared to higher-level APIs such as Keras or MXNet. Nonetheless, I am sure that TensorFlow will make its way toward becoming the de-facto standard in neural network and deep learning development, in research as well as in practical applications. Many of our customers are already using TensorFlow or starting to develop projects that employ TensorFlow models. Our data science consultants at STATWORX are also heavily using TensorFlow for deep learning and neural net research and development. Let’s see what Google has planned for the future of TensorFlow. One thing that is missing, at least in my opinion, is a neat graphical user interface for designing and developing neural net architectures with a TensorFlow backend. Maybe, this is something Google is already working on ;)

Final remarks

If you have any comments or questions on my story, feel free to comment below! I will try to answer them. Also, feel free to use my code or share this story with your peers on social platforms of your choice. Follow me on LinkedIn or Twitter, if you want to stay in touch.
