A Quick Example of Time-Series Prediction Using Long Short-Term Memory (LSTM) Networks

Published in The Startup · 4 min read · Aug 2, 2019

The code in the post can be found at https://github.com/gianfelton/12-Month-Forecast-With-LSTM

After seeing a lot of posts where predictions were plotted against test sets (my posts included), I wanted to do a quick demo of actually predicting beyond the time-frame of a dataset. (Although it isn’t shown, RMSE was used to tune parameters.)

Part 1: Building the Model and Comparing Against the Test Set

Let’s start with our imports.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from statsmodels.tools.eval_measures import rmse
from sklearn.preprocessing import MinMaxScaler
from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
import warnings
warnings.filterwarnings("ignore")

df = pd.read_csv('AirPassengers.csv')

The next thing we want to do is put the month column in the index.

df.Month = pd.to_datetime(df.Month)
df = df.set_index("Month")
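As a small illustration of what those two lines do, here is a sketch on a three-row toy frame. The column name Passengers and the values are made up for the example; they are not taken from AirPassengers.csv.

```python
import pandas as pd

# Toy data standing in for the CSV; names and values are illustrative only.
toy = pd.DataFrame({"Month": ["2000-01", "2000-02", "2000-03"],
                    "Passengers": [10, 20, 30]})

toy["Month"] = pd.to_datetime(toy["Month"])  # strings become Timestamps
toy = toy.set_index("Month")                 # Month becomes a DatetimeIndex

# Rows can now be looked up by date.
feb_value = toy.loc[pd.Timestamp("2000-02-01"), "Passengers"]
print(type(toy.index).__name__, feb_value)
```

Having a DatetimeIndex is what lets us do date arithmetic on the index later, when we build the future dates.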

With that finished, we can split our data between the training and testing sets.

train, test = df[:-12], df[-12:]
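To make the slicing concrete, here is a rough sketch on a made-up 36-month frame (the data is hypothetical). The point is that the split is chronological: the last 12 rows are held out, and nothing is shuffled.

```python
import pandas as pd

# Hypothetical 36 months of data, one value per month.
series = pd.DataFrame(
    {"value": range(36)},
    index=pd.date_range("2000-01-01", periods=36, freq="MS"),
)

# Same slicing as above: everything but the last 12 rows, then the last 12.
train_toy, test_toy = series[:-12], series[-12:]

print(len(train_toy), len(test_toy))
print(train_toy.index.max(), test_toy.index.min())
```

Every training date comes before every test date, which is what we want for a forecasting problem.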

We’ll need to scale our data.

scaler = MinMaxScaler()
scaler.fit(train)
train = scaler.transform(train)
test = scaler.transform(test)
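Note that the scaler is fit on the training set only and then applied to both sets. Assuming the default (0, 1) feature range, the arithmetic can be sketched by hand in NumPy as below. The numbers are toy values and the scale helper is just for illustration; it is not part of scikit-learn.

```python
import numpy as np

# Toy values, shaped (n_samples, 1) like a single-column DataFrame.
train_toy = np.array([[100.0], [150.0], [200.0]])
test_toy = np.array([[180.0], [250.0]])

# "Fit": remember the min and max of the training data only.
lo, hi = train_toy.min(axis=0), train_toy.max(axis=0)

def scale(x):
    """Illustrative helper: min-max scale x using the training min and max."""
    return (x - lo) / (hi - lo)

train_scaled = scale(train_toy)  # always lands in [0, 1]
test_scaled = scale(test_toy)    # can exceed 1 if test values exceed the train max

print(train_scaled.ravel(), test_scaled.ravel())
```

Fitting on train alone keeps information about the test period out of the model's inputs, at the cost of test values possibly falling outside the 0 to 1 range.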

From here, we can create and fit our model.

n_input = 12
n_features = 1
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=6)

model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_input, n_features)))
model.add(Dropout(0.15))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

model.fit_generator(generator,epochs=90)
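If it isn't obvious what the generator feeds the model, here is a rough NumPy sketch of the windowing idea: each sample is a run of length consecutive values, and its target is the value that comes right after. The make_windows helper is hypothetical and ignores batching; it is only meant to show the shapes.

```python
import numpy as np

def make_windows(series, length):
    """Pair each run of `length` values with the value that follows it."""
    n_samples = len(series) - length
    X = np.array([series[i:i + length] for i in range(n_samples)])
    y = np.array([series[i + length] for i in range(n_samples)])
    return X, y

# Toy series 0..5 with a single feature, shaped (6, 1).
series = np.arange(6, dtype=float).reshape(-1, 1)
X, y = make_windows(series, length=3)

print(X.shape, y.shape)        # (samples, length, features) and (samples, features)
print(X[0].ravel(), y[0])      # first window and its target
```

With n_input = 12, each sample is a year of monthly values and the target is the month that follows, which is why input_shape is (n_input, n_features).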

I got the technique below from Caner Dabakoglu here on Medium. In it we are doing a few things:

create an empty list to hold our 12 predictions
create the batch (the last 12 scaled training values) that our model will predict from
save each prediction to our list
drop the oldest value from the batch and add the prediction to the end, so it is used in the next prediction

pred_list = []

batch = train[-n_input:].reshape((1, n_input, n_features))

for i in range(n_input):   
    pred_list.append(model.predict(batch)[0]) 
    batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1)
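The sliding-window update is easier to follow with small numbers. Below is a minimal sketch of the same loop with a window of 3 and a made-up stand-in for the model: toy_predict simply returns the mean of the window, and it is not what the LSTM does. Only the batch handling mirrors the code above.

```python
import numpy as np

def toy_predict(batch):
    """Hypothetical stand-in for model.predict: window mean, shape (1, n_features)."""
    return batch.mean(axis=1)

n_input, n_features = 3, 1
history = np.array([[1.0], [2.0], [3.0]])
batch = history.reshape((1, n_input, n_features))

preds = []
for _ in range(4):
    next_val = toy_predict(batch)[0]   # shape (n_features,)
    preds.append(next_val)
    # Drop the oldest time step and append the prediction as the newest one,
    # so the window always holds n_input steps.
    batch = np.append(batch[:, 1:, :], [[next_val]], axis=1)

print(batch.shape)
print([float(p[0]) for p in preds])
```

After the first few steps the window contains mostly predictions rather than real observations, so errors can compound the further out we forecast.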

Now that we have our list of predictions, we need to reverse the scaling we did in the beginning. The code is also creating a dataframe out of the prediction list, which is concatenated with the original dataframe. I did this for plotting. There are many other (better) ways to do this.

df_predict = pd.DataFrame(scaler.inverse_transform(pred_list),
                          index=df[-n_input:].index, columns=['Prediction'])

df_test = pd.concat([df,df_predict], axis=1)
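Here is a toy sketch of those two steps done by hand. It assumes the default (0, 1) min-max range, so undoing the scaling is x * (max - min) + min. All values and the column name Actual are invented for the example.

```python
import numpy as np
import pandas as pd

# Pretend training min and max, plus two scaled predictions (all made up).
lo, hi = 100.0, 200.0
scaled_preds = np.array([[0.8], [1.5]])

# Undo min-max scaling by hand.
restored = scaled_preds * (hi - lo) + lo

idx = pd.date_range("2000-01-01", periods=4, freq="MS")
actual = pd.DataFrame({"Actual": [120.0, 150.0, 175.0, 240.0]}, index=idx)
pred = pd.DataFrame(restored, index=idx[-2:], columns=["Prediction"])

# axis=1 lines the frames up on the date index; months without a
# prediction get NaN in the Prediction column.
combined = pd.concat([actual, pred], axis=1)
print(combined)
```

The NaN rows are harmless for plotting, since matplotlib just skips them when drawing the line.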

Next, we plot the predictions against the actuals.

plt.figure(figsize=(20, 5))
plt.plot(df_test.index, df_test['AirPassengers'], label='Actual')
plt.plot(df_test.index, df_test['Prediction'], color='r', label='Prediction')
plt.legend(loc='best', fontsize='xx-large')
plt.xticks(fontsize=18)
plt.yticks(fontsize=16)
plt.show()
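To put a number on the fit alongside the plot, RMSE is the metric used for tuning in this post, and the rmse helper imported from statsmodels at the top can be called on the actual and predicted values. As a sketch of what that metric computes, here it is in plain NumPy with made-up numbers. The rmse_np name is just for this example.

```python
import numpy as np

def rmse_np(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Toy values: errors are -10, 10 and -20.
score = rmse_np([100, 200, 300], [110, 190, 320])
print(round(score, 4))
```

Because the errors are squared before averaging, a few large misses move the score more than many small ones. The result is in the same units as the data (passengers, here).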

This is good, but what we really need is the ability to predict beyond the time-frame of the dataset. The following code works through this. It is mainly the same code, except where future dates are added on.

Part 2: Predicting Beyond the Dataset

train = df
scaler.fit(train)
train = scaler.transform(train)

n_input = 12
n_features = 1
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=6)
model.fit_generator(generator,epochs=90)

pred_list = []

batch = train[-n_input:].reshape((1, n_input, n_features))

for i in range(n_input):
    pred_list.append(model.predict(batch)[0])
    batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1)

Here, we create our new dates for the next 12 months.

from pandas.tseries.offsets import DateOffset
add_dates = [df.index[-1] + DateOffset(months=x) for x in range(0,13) ]
future_dates = pd.DataFrame(index=add_dates[1:],columns=df.columns)
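For what it's worth, the same 12 month-start dates can also be produced with pd.date_range. The sketch below compares the two approaches, assuming the last date in the index is 1960-12-01 (hard-coded here so the snippet runs on its own).

```python
import pandas as pd
from pandas.tseries.offsets import DateOffset

last = pd.Timestamp("1960-12-01")  # assumed last month in the dataset

# Offsets 1..12 from the last date, skipping offset 0 (the last date itself).
via_offsets = [last + DateOffset(months=x) for x in range(1, 13)]

# Alternative: 12 month-start ("MS") dates beginning one month after the last.
via_range = pd.date_range(start=last + DateOffset(months=1), periods=12, freq="MS")

print(via_range[0], via_range[-1])
print(list(via_range) == via_offsets)
```

Either way, the result is an index of 12 future months that lines up with the 12 predictions.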

The following code is the same, except for the index being set to the future dates.

df_predict = pd.DataFrame(scaler.inverse_transform(pred_list),
                          index=future_dates[-n_input:].index, columns=['Prediction'])

df_proj = pd.concat([df,df_predict], axis=1)

And now we can check out the results.

plt.figure(figsize=(20, 5))
plt.plot(df_proj.index, df_proj['AirPassengers'], label='Actual')
plt.plot(df_proj.index, df_proj['Prediction'], color='r', label='Prediction')
plt.legend(loc='best', fontsize='xx-large')
plt.xticks(fontsize=18)
plt.yticks(fontsize=16)
plt.show()
