A Quick Example of Time-Series Prediction Using Long Short-Term Memory (LSTM) Networks

Ian Felton
Published in The Startup
4 min read · Aug 2, 2019

The code in the post can be found at https://github.com/gianfelton/12-Month-Forecast-With-LSTM

After seeing a lot of posts where predictions were plotted against test sets (my posts included), I wanted to do a quick demo of actually predicting beyond the time-frame of a dataset. (Although it isn’t shown, RMSE was used to tune parameters.)

Part 1: Building the Model and Comparing Against the Test Set

Let’s start with our imports.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from statsmodels.tools.eval_measures import rmse
from sklearn.preprocessing import MinMaxScaler
from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
import warnings
warnings.filterwarnings("ignore")

df = pd.read_csv('AirPassengers.csv')

The next thing we want to do is put the month column in the index.

df.Month = pd.to_datetime(df.Month)
df = df.set_index("Month")

With that finished, we can split our data between the training and testing sets.

train, test = df[:-12], df[-12:]

We’ll need to scale our data.

scaler = MinMaxScaler()
scaler.fit(train)
train = scaler.transform(train)
test = scaler.transform(test)
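To make the scaling step concrete, here is a minimal sketch of the min-max arithmetic using plain NumPy and made-up numbers. The helper names `scale` and `unscale` are hypothetical; they stand in for what `MinMaxScaler` does with its default 0-to-1 range. Because the minimum and maximum come from the training rows only, scaled test values can fall outside that range.

```python
import numpy as np

# Toy values (made up); the last two rows play the role of the test set.
values = np.array([[100.0], [150.0], [200.0], [250.0], [300.0]])
train, test = values[:-2], values[-2:]

# "Fit": learn the minimum and maximum from the training rows only.
lo, hi = train.min(axis=0), train.max(axis=0)

def scale(x):
    # Map the training minimum to 0 and the training maximum to 1.
    return (x - lo) / (hi - lo)

def unscale(x):
    # Reverse the mapping to get back to the original units.
    return x * (hi - lo) + lo

train_scaled = scale(train)   # 0.0, 0.5, 1.0
test_scaled = scale(test)     # 1.5, 2.0 (above 1, since test exceeds the training max)
```

Fitting on the training set alone keeps information about the test period out of the preprocessing step.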

From here, we can create and fit our model.

n_input = 12
n_features = 1
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=6)
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_input, n_features)))
model.add(Dropout(0.15))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit_generator(generator,epochs=90)
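It can help to see what kind of samples the generator feeds the model. The sketch below is not the Keras implementation; `make_windows` is a hypothetical helper that pairs each run of `length` consecutive values with the value that follows it, which is the windowing idea behind passing the same array as both data and targets.

```python
import numpy as np

def make_windows(series, length):
    """Hypothetical helper: pair each window of `length` values with the next value."""
    X = np.array([series[i - length:i] for i in range(length, len(series))])
    y = np.array([series[i] for i in range(length, len(series))])
    return X, y

# Toy series 0..9 shaped (samples, features), like the scaled training data.
series = np.arange(10, dtype=float).reshape(-1, 1)
X, y = make_windows(series, length=3)

# X has shape (7, 3, 1): seven windows of three time steps with one feature.
# y has shape (7, 1): the value that follows each window.
```

With `length=12` on monthly data, each sample is one year of history and the target is the following month.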

I got the technique below from Caner Dabakoglu here on Medium. In it we are doing a few things:

  • create an empty list to hold our 12 predictions
  • create the batch that our model will predict off of
  • save the prediction to our list
  • add the prediction to the end of the batch to be used in the next prediction

pred_list = []

batch = train[-n_input:].reshape((1, n_input, n_features))

for i in range(n_input):
    pred_list.append(model.predict(batch)[0])
    batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1)
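The rolling-batch steps above can be traced without a trained network. In this sketch, `predict_stub` is a hypothetical stand-in for `model.predict` that returns the last value in the window plus one, so it is easy to see how each prediction becomes part of the input for the next one.

```python
import numpy as np

n_input, n_features = 3, 1

def predict_stub(batch):
    """Hypothetical stand-in for model.predict: last value + 1, shape (1, n_features)."""
    return batch[:, -1, :] + 1.0

history = np.array([[1.0], [2.0], [3.0]])
batch = history[-n_input:].reshape((1, n_input, n_features))

pred_list = []
for i in range(4):
    pred = predict_stub(batch)[0]   # shape (n_features,)
    pred_list.append(pred)
    # Drop the oldest time step and append the prediction as the newest one.
    batch = np.append(batch[:, 1:, :], [[pred]], axis=1)

# pred_list now holds 4.0, 5.0, 6.0, 7.0 and the window has rolled to [5, 6, 7].
```

Because later predictions are built on earlier ones, any error in an early step is carried forward through the rest of the forecast.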

Now that we have our list of predictions, we need to reverse the scaling we did in the beginning. The code is also creating a dataframe out of the prediction list, which is concatenated with the original dataframe. I did this for plotting. There are many other (better) ways to do this.

df_predict = pd.DataFrame(scaler.inverse_transform(pred_list),
                          index=df[-n_input:].index, columns=['Prediction'])
df_test = pd.concat([df,df_predict], axis=1)

Next, we plot the predictions against the actuals.

plt.figure(figsize=(20, 5))
plt.plot(df_test.index, df_test['AirPassengers'], label='AirPassengers')
plt.plot(df_test.index, df_test['Prediction'], color='r', label='Prediction')
plt.legend(loc='best', fontsize='xx-large')
plt.xticks(fontsize=18)
plt.yticks(fontsize=16)
plt.show()
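The introduction mentions that RMSE was used to tune parameters, and the imports include `rmse` from statsmodels, but the calculation itself is not shown. As a rough sketch of what that metric measures, here it is spelled out with NumPy on made-up numbers; `rmse_sketch` is a hypothetical name, not the statsmodels function.

```python
import numpy as np

def rmse_sketch(actual, predicted):
    """Root mean squared error: square the errors, average them, take the square root."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up values for illustration only.
actual = [10, 20, 30]
predicted = [13, 16, 30]

# Errors are -3, 4 and 0, so the mean squared error is 25 / 3.
score = rmse_sketch(actual, predicted)
```

A lower score on the 12 held-out months means the predictions sit closer to the actuals, which makes it a usable yardstick when comparing settings such as the number of epochs or LSTM units.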

This is good, but what we really need is the ability to predict beyond the time-frame of the dataset. The following code works through this. It is mainly the same code, except where future dates are added on.

Part 2: Predicting Beyond the Dataset

# Train on the full dataset this time (no held-out test set)
train = df
scaler.fit(train)
train = scaler.transform(train)
n_input = 12
n_features = 1
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=6)
model.fit_generator(generator,epochs=90)

pred_list = []

batch = train[-n_input:].reshape((1, n_input, n_features))

for i in range(n_input):
    pred_list.append(model.predict(batch)[0])
    batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1)

Here, we create our new dates for the next 12 months.

from pandas.tseries.offsets import DateOffset
add_dates = [df.index[-1] + DateOffset(months=x) for x in range(0,13) ]
future_dates = pd.DataFrame(index=add_dates[1:],columns=df.columns)
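As an aside, the same 12 future dates can likely be built with `pd.date_range`. This is only a sketch under two assumptions: that the index holds month-start dates, and that the last date is December 1960 (hard-coded here for illustration rather than read from the CSV).

```python
import pandas as pd
from pandas.tseries.offsets import DateOffset

# Assumed last date in the index, for illustration only.
last_date = pd.Timestamp("1960-12-01")

# Approach used in the post: offset the last date by 1 to 12 months.
add_dates = [last_date + DateOffset(months=x) for x in range(1, 13)]

# Alternative: 12 month-start dates beginning one month after the last date.
future_index = pd.date_range(start=last_date + DateOffset(months=1),
                             periods=12, freq="MS")
```

The two only match when the data falls on the first of each month; for other day-of-month conventions the list comprehension is the safer choice.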

The following code is the same, except for the index being set to the future dates.

df_predict = pd.DataFrame(scaler.inverse_transform(pred_list),
index=future_dates[-n_input:].index, columns=['Prediction'])

df_proj = pd.concat([df,df_predict], axis=1)

And now we can check out the results.

plt.figure(figsize=(20, 5))
plt.plot(df_proj.index, df_proj['AirPassengers'], label='AirPassengers')
plt.plot(df_proj.index, df_proj['Prediction'], color='r', label='Prediction')
plt.legend(loc='best', fontsize='xx-large')
plt.xticks(fontsize=18)
plt.yticks(fontsize=16)
plt.show()
