Neural Networks for Forecasting Financial and Economic Time Series

A real-world application on Azure Deep Learning Virtual Machine

This week I attended The Data Science Conference in Chicago, an industry-focused yet academic conference centered on real-world solutions. Every presentation is an example of cutting-edge work being done, and a great opportunity to get an in-depth update on the state of the industry and learn best practices from other data scientists.

In my presentation, I shared a few insights from my latest research on “Neural Networks for Forecasting Financial and Economic Time Series”. Neural networks are a very comprehensive family of machine learning models and, in recent years, their applications in finance and economics have dramatically increased. However, standard neural networks have no built-in notion of a time axis: they treat their inputs as unordered.

The goal of this article is to provide a practical introductory guide to neural networks for forecasting financial time series data using Azure Deep Learning Virtual Machine. A multi-step approach to designing a neural network forecasting model will be explained, including an application to stock market prediction with LSTM in Python.

Introduction to time series forecasting

One of the most important elements of today’s decision-making world, in both the public and the private sectors, is the forecasting of macroeconomic and financial variables. During the past few decades, econometric model-based forecasting has become very popular in both private and public decision-making processes. To better understand the meaning of “time series forecast”, let’s split the term into two parts:

  1. A time series is a sequence of observations taken sequentially in time.
  2. Forecast means making predictions about a future event.

When forecasting is performed on time series data, that is, on observations recorded over a time interval, it is called time series forecasting: the process of predicting future events based on historical data.

Time series forecasting has long been used across many industries to guide future decisions. In retail, for example, sales forecasts are important so that raw materials can be procured accordingly. The most famous example is weather forecasting, where the future is predicted from past patterns and recent changes. Such predictions are often the first step toward solving other problems, such as planning power generation to avoid unnecessary power disruptions or overproduction.

In any forecast scenario, there are three questions that you always want to ask yourself before building the forecast model:

  • What is the time horizon required for my predictions?
  • What is the temporal frequency required for my predictions?
  • Can forecasts be updated frequently over time, or should they be produced only once and remain static?

The answers to these three questions will help you define the most critical components of a time series, which are:

  • Trend: Long-term component that defines the gradual increase or decrease of your series.
  • Cycle: Long-term component that defines the swings of the series.
  • Seasonality: Regular component that captures the relatively short-term fluctuations of the series.
  • Error: Random variability in the observations that cannot be explained by the model.
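
To make these components concrete, here is a minimal sketch that builds a synthetic series as trend + seasonality + error and then recovers the trend with a centered moving average. The additive form, the weekly period and all numeric values are illustrative assumptions, and the cycle component is left out to keep the example short.

```python
import math
import random


def moving_average(series, window):
    """Centered moving average. Positions without a full window stay None."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered average")
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out


random.seed(0)
period = 7   # assumed weekly seasonality
n = 70       # number of observations

trend = [0.5 * t for t in range(n)]
seasonal = [3.0 * math.sin(2 * math.pi * t / period) for t in range(n)]
error = [random.gauss(0.0, 0.2) for _ in range(n)]
series = [trend[t] + seasonal[t] + error[t] for t in range(n)]

# Averaging over exactly one seasonal period cancels the seasonal term,
# so what remains is the trend plus a small amount of averaged noise.
estimated_trend = moving_average(series, window=period)
```

Subtracting `estimated_trend` from `series` would leave the seasonal pattern plus the error term, which is the usual next step in a classical decomposition.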

Introduction to Neural Networks

Simple Exponential Smoothing and Autoregressive Integrated Moving Average (ARIMA) are considered more traditional time series forecasting models. However, in recent years, neural networks have become one of the most popular trends in machine learning, with applications in many areas, including driverless cars and robotics, speech and image recognition, and financial forecasting.
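
For comparison with the neural approaches that follow, here is a minimal sketch of simple exponential smoothing. The function name, the smoothing factor `alpha` and the sample values are illustrative assumptions; initializing the level with the first observation is one common choice among several.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    The last level serves as the one-step-ahead forecast.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumption: start from the first observation
    levels = [level]
    for value in series[1:]:
        # New level = weighted mix of the latest value and the old level.
        level = alpha * value + (1.0 - alpha) * level
        levels.append(level)
    return levels


prices = [10.0, 12.0, 11.0, 13.0]  # made-up values
levels = simple_exponential_smoothing(prices, alpha=0.5)
forecast = levels[-1]  # 12.0 for this toy input
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths more aggressively.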

Neural networks are a set of algorithms designed to recognize patterns, and deep learning is the name we use for “stacked neural networks”; that is, networks composed of several layers. The layers are made of nodes. A node is simply a place where computation happens: it combines input from the data with a set of coefficients, or weights, that either amplify or dampen that input.

These input-weight products are summed, and the sum is passed through the node’s so-called activation function to determine whether and to what extent that signal affects the ultimate outcome. This is forward propagation: we move from the input layer to the hidden layer and finally to the output layer.
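
The forward pass described above can be sketched in a few lines of plain Python. The network shape, the weights, the bias terms and the choice of a sigmoid activation are all illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias, activation):
    """Sum the input-weight products, add a bias, apply the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


def forward(inputs, hidden_layer, output_node):
    """Input layer -> one hidden layer -> single output node."""
    hidden = [node_output(inputs, w, b, sigmoid) for w, b in hidden_layer]
    w_out, b_out = output_node
    # Linear (identity) activation on the output, as is usual for regression.
    return node_output(hidden, w_out, b_out, lambda z: z)


# Two hidden nodes, each given as (weights, bias); values are made up.
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_node = ([2.0, -2.0], 0.5)

prediction = forward([1.0, 1.0], hidden_layer, output_node)
```

Training a network means adjusting these weights and biases so that `prediction` moves closer to the observed target.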

What are RNNs and LSTMs? Let’s Unroll!

The idea behind Recurrent Neural Networks (RNNs) is to make use of sequential information. In a traditional neural network we assume that all inputs are independent of each other, but for many tasks that is a poor assumption. RNNs are called recurrent because they perform the same task for every element of a sequence, with the output depending on the previous computations.

Another way to think about RNNs is that they have a “memory” which captures information about what has been calculated so far. In theory RNNs can make use of information in arbitrarily long sequences, but in practice they are limited to looking back only a few steps (more on this later).
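
A minimal scalar sketch of this “memory” is shown below: the same function is applied at every time step, and the hidden state carries information forward. The tanh update rule is the textbook formulation of a simple RNN; the weights and the input sequence are illustrative assumptions.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent update: mix the new input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same step to every element, carrying the state along."""
    h = 0.0  # initial hidden state
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# A single impulse followed by zeros: the state remembers the impulse,
# but the memory shrinks at every step.
states = run_rnn([1.0, 0.0, 0.0])
```

The shrinking values in `states` give a small-scale picture of why a plain RNN struggles to look back more than a few steps.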

LSTM networks are quite popular these days. LSTMs don’t have a fundamentally different architecture from RNNs, but they use a different function to compute the hidden state. An LSTM cell has five essential components that allow it to model both long-term and short-term dependencies: the cell state, hidden state, input gate, forget gate and output gate. One critical advantage of LSTMs is their ability to retain information over long sequences.
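
The interplay of these components can be sketched with a scalar LSTM step. This follows the standard textbook formulation, which also uses a candidate value alongside the three gates; the parameter layout and all numbers are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    p maps each name to a hypothetical (w_x, w_h, b) triple.
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x_t + w_h * h_prev + b

    i = sigmoid(pre("input"))         # input gate: how much new info to write
    f = sigmoid(pre("forget"))        # forget gate: how much old state to keep
    o = sigmoid(pre("output"))        # output gate: how much state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c = f * c_prev + i * g            # new cell state (long-term memory)
    h = o * math.tanh(c)              # new hidden state (short-term output)
    return h, c


# Forget gate wide open, input gate almost closed: the cell state
# passes through nearly unchanged, whatever the input is.
params = {
    "input": (0.0, 0.0, -10.0),
    "forget": (0.0, 0.0, 10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x_t=5.0, h_prev=0.0, c_prev=2.0, p=params)
```

Because the cell state is updated additively and gated, it can carry a value across many steps, which is what lets the cell retain long-term information.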

Stock market predictions with LSTMs: A real-world application on Azure Deep Learning Virtual Machine

LSTM models can use the history of a sequence to predict its future elements. This is helpful in many cases, for example when you need to model stock prices.

For this specific scenario, I built my model using a Deep Learning Virtual Machine, because deep learning requires a large amount of computational power to train models on large data sets. With the cloud and the availability of Graphics Processing Units (GPUs), it is becoming possible to build sophisticated deep neural architectures and train them on large data sets using powerful cloud computing infrastructure.

The Deep Learning Virtual Machine is a specially configured variant of the Data Science Virtual Machine (DSVM) that makes it more straightforward to use GPU-based VM instances for training deep learning models. Here you can find more information on how to get started and provision a Data Science Virtual Machine on Azure.

For this real-world example I used a stock market data set from Kaggle with the following information:

  • Open: Opening stock price of the day
  • Close: Closing stock price of the day
  • High: Highest stock price of the day
  • Low: Lowest stock price of the day

In the rest of this article, I will describe a multi-step approach to designing a neural network forecasting model. This approach can be summarized as follows:

The first step is to define hyperparameters:
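
As a minimal sketch, the hyperparameters can be grouped in one immutable object so that every later step reads from the same place. The field names and default values below are illustrative assumptions, not the configuration used in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Hypothetical settings for an LSTM forecasting model."""
    input_dim: int = 1          # features per time step (e.g. one price)
    num_unrollings: int = 50    # time steps the LSTM is unrolled over
    batch_size: int = 32        # sequences per training batch
    hidden_units: int = 64      # size of the LSTM hidden state
    num_layers: int = 2         # stacked LSTM layers
    dropout: float = 0.2        # fraction of units dropped during training
    learning_rate: float = 1e-3
    epochs: int = 30


hp = Hyperparameters()
```

Freezing the dataclass prevents a value from being changed by accident halfway through an experiment.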

Next you define placeholders for training inputs and labels:
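
Placeholders, in the TensorFlow 1.x sense of the word, are the graph inputs that receive each batch at run time. The framework-free sketch below only shows how the inputs and labels fed into them might be built from a price series with a sliding window; the function name `make_windows` and the sample values are illustrative assumptions.

```python
def make_windows(series, window, horizon=1):
    """Split a series into (input window, label) pairs.

    Each input holds `window` consecutive values; its label is the value
    `horizon` steps after the end of that window.
    """
    inputs, labels = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        labels.append(series[start + window + horizon - 1])
    return inputs, labels


closes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up closing prices
X, y = make_windows(closes, window=3)
# X is [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
# y is [4.0, 5.0, 6.0]
```

Each row of `X` paired with the matching entry of `y` is one training example: a short price history and the value that followed it.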

Now you can define the parameters of the LSTM and regression layer:
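
To get a feel for what these parameters amount to, the sketch below counts the trainable values in a standard LSTM layer and in a linear regression layer on top of it. The helper names and the layer sizes are illustrative assumptions.

```python
def lstm_param_count(input_dim, hidden_units):
    """Trainable parameters in one standard LSTM layer.

    There are four weight blocks (input gate, forget gate, output gate,
    candidate), each with input weights, recurrent weights and a bias.
    """
    per_block = hidden_units * (input_dim + hidden_units) + hidden_units
    return 4 * per_block


def regression_param_count(hidden_units, output_dim=1):
    """Weights plus bias of a linear layer mapping hidden state to output."""
    return hidden_units * output_dim + output_dim


lstm_params = lstm_param_count(input_dim=1, hidden_units=64)  # 16896
head_params = regression_param_count(hidden_units=64)         # 65
```

Almost all of the parameters sit in the recurrent layer; the regression layer only maps the final hidden state to a single predicted price.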

For each batch of predictions and true outputs, you can calculate the Mean Squared Error. Finally, you define the optimizer you’re going to use to optimize the neural network.
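
The loss and the optimization step can be sketched without any framework. Plain gradient descent on a one-weight linear model stands in here for whichever optimizer you choose; the data, the learning rate and the function names are illustrative assumptions.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def gradient_step(w, b, xs, ys, learning_rate):
    """One gradient descent update of a linear model y = w * x + b on MSE."""
    n = len(xs)
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)
    return w - learning_rate * grad_w, b - learning_rate * grad_b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

w, b = 0.0, 0.0
initial_loss = mean_squared_error([w * x + b for x in xs], ys)

for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, learning_rate=0.05)

final_loss = mean_squared_error([w * x + b for x in xs], ys)
```

After training, `w` and `b` end up close to 2 and 1, and `final_loss` is far below `initial_loss`. An LSTM is trained in the same spirit, only with many more parameters and gradients computed through time.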


Now you can train the model and predict stock price movements. The goal of this post was to provide a practical introductory guide to neural networks for forecasting financial time series data using Azure Deep Learning Virtual Machine. A multi-step approach to designing a neural network forecasting model with LSTM in Python was also explained.