Timeseries forecasting using LSTM

Dipanwita Mallick
Published in Analytics Vidhya
5 min read · Dec 17, 2020

LSTM (long short-term memory) is a variant of the RNN (recurrent neural network) that can learn long-term dependencies, which makes it well suited to sequence prediction problems.

Here I am going to use an LSTM to demonstrate how we can apply this technique to time series forecasting.

Data: https://www.kaggle.com/uciml/electric-power-consumption-data-set

data.head() output

Data preprocessing:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# `data` is assumed to be the Kaggle file already loaded into a DataFrame
# Combine Date and Time into one datetime column (dates in this file are day-first, dd/mm/yyyy)
data['Date'] = pd.to_datetime(data.Date.astype(str) + ' ' + data.Time.astype(str), dayfirst=True)
# Drop the Time column
data.drop(["Time"], inplace=True, axis=1)
# Set the Date column as the index
data.set_index(["Date"], inplace=True)
# Keep only the target column (as a DataFrame, not a Series)
data = data[["Global_active_power"]].copy()
# Check whether there are any non-numeric entries in the column
print(data[pd.to_numeric(data['Global_active_power'], errors='coerce').isnull()]["Global_active_power"].unique())
# Convert to float, turning non-numeric entries (in this case '?') into NaN
data["Global_active_power"] = pd.to_numeric(data["Global_active_power"], errors="coerce")

The data is recorded every minute, so we can resample it by hour, day, or month. Resampling by month would leave too few data points, and forecasts on daily data weren't giving any interesting results, so I chose to resample by hour.

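# 'H' is the hourly alias; recent pandas versions prefer lowercase 'h'
# .iloc[1:] drops the first resampled row
data_sampled_hr = data["Global_active_power"].resample('H').mean().iloc[1:]
data_sampled_hr = pd.DataFrame(data_sampled_hr)
data_sampled_hr.head()
# Let's fill the NaNs with 0 and visualize the data
data_sampled_hr = data_sampled_hr.fillna(0)
data_sampled_hr.plot(figsize = (12,8));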
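
To make the resample-and-fill step concrete, here is a minimal sketch on made-up data (three hours of minute-level readings, not the Kaggle file). The names `toy`, `hourly`, and `hourly_filled` are hypothetical, and `"60min"` is used as an equivalent of the hourly alias.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three hours of minute-level readings.
idx = pd.date_range("2020-01-01 00:00", periods=180, freq="min")
toy = pd.Series(np.repeat([1.0, 2.0, 3.0], 60), index=idx)

# Blank out the whole middle hour to mimic missing readings.
toy.iloc[60:120] = np.nan

# Average the minute-level values within each hour.
hourly = toy.resample("60min").mean()

# An hour with no valid readings comes out as NaN; fillna(0) turns it into 0.
hourly_filled = hourly.fillna(0)
print(hourly_filled.tolist())
```

Note that filling with 0 makes a gap look like an hour of zero consumption. That is the choice made in this article; interpolation would be another option to experiment with.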

The visualization doesn’t help in understanding if there is any seasonality or trend. So, let's decompose the time series using seasonal decomposition.

from statsmodels.tsa.seasonal import seasonal_decompose 
results = seasonal_decompose(data_sampled_hr)
results.seasonal[:1000].plot(figsize = (12,8));

Let’s plot the autocorrelation function (ACF) to see how the time series correlates with its past values.

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
plot_acf(data_sampled_hr)
plt.show()

This is interesting! The x-axis of the ACF plot is the lag in hours, so the plot shows that the correlation is high at short lags (the first few hours) and then rises again at lags of about a day. That suggests a daily seasonal pattern in the data.
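
As a rough illustration of what the ACF measures, here is a small NumPy sketch on a synthetic hourly signal with a 24-hour cycle. The function `autocorr` and the synthetic `signal` are my own assumptions for illustration and are not part of the dataset or of statsmodels.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a 1-D array at a positive lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    # Covariance between the series and its lagged copy, over the total variance.
    return float(np.dot(centered[:-lag], centered[lag:]) / np.dot(centered, centered))

# Hypothetical hourly signal: a 24-hour sine cycle plus a little noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
signal = np.sin(2 * np.pi * hours / 24) + 0.1 * rng.standard_normal(hours.size)

for lag in (1, 6, 12, 24):
    print(lag, round(autocorr(signal, lag), 2))
```

For a signal like this, the autocorrelation is strongly positive at lag 1, close to zero around lag 6, strongly negative at lag 12, and strongly positive again at lag 24, which is the signature of a daily cycle.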

Model building and evaluation:

Let’s see if the LSTM model can make some predictions or understand the general trend of the data.

For forecasting, we can use a 48-hour (2-day) time window to predict the next value. Let’s set up the training and test data.

train = data_sampled_hr[:-48]
test = data_sampled_hr[-48:] # last 48 hours is my test data

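Now we need to scale our data, since neural networks generally train better on inputs in a small, consistent range. Note that the scaler is fitted on the training data only, so no information from the test set leaks into training.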

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaler.fit(train)
scaled_train = scaler.transform(train)
scaled_test = scaler.transform(test)
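
For intuition, with its default settings `MinMaxScaler` maps each value to (x - min) / (max - min), using the minimum and maximum seen during `fit`. The sketch below reproduces that arithmetic in NumPy on made-up numbers; `train_vals`, `test_vals`, `scale`, and `unscale` are hypothetical names.

```python
import numpy as np

# Hypothetical values standing in for the train and test splits.
train_vals = np.array([[2.0], [4.0], [6.0]])
test_vals = np.array([[5.0], [8.0]])

# "Fit": learn the minimum and maximum from the training data only.
lo = train_vals.min(axis=0)
hi = train_vals.max(axis=0)

def scale(x):
    """Map values to the training range, so the training data lands in [0, 1]."""
    return (x - lo) / (hi - lo)

def unscale(x):
    """Inverse of scale(), the counterpart of inverse_transform."""
    return x * (hi - lo) + lo

print(scale(train_vals).ravel())  # [0.  0.5 1. ]
print(scale(test_vals).ravel())   # [0.75 1.5 ]
```

Because the range comes from the training data, test values can fall outside [0, 1], as the 1.5 above shows.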

Now we are going to use the time series generator from the Keras library to build the training inputs and labels. The generator takes the first 48 data points as one input and uses the 49th as its label, then slides forward by one step: points 2 to 49 become the next input with the 50th as its label, and so on.

# Time series generator
# Note: TimeseriesGenerator is deprecated in recent TensorFlow/Keras releases
from keras.preprocessing.sequence import TimeseriesGenerator
# Define the generator
# I have used a batch_size of 10 so that training is faster; 1 works as well
n_input = 48
n_features = 1
# Both arguments are scaled_train, because the input windows and their labels
# are drawn from the same series
generator = TimeseriesGenerator(scaled_train, scaled_train, length=n_input, batch_size=10)
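
The sliding-window pairing described above can be sketched in plain NumPy. This is an illustration of the idea and not the Keras implementation; `make_windows` and `toy` are hypothetical names, and a window length of 3 is used so the output is easy to read.

```python
import numpy as np

def make_windows(series, length):
    """Pair each run of `length` consecutive points with the point that follows it."""
    series = np.asarray(series)
    n_samples = len(series) - length
    X = np.array([series[i:i + length] for i in range(n_samples)])
    y = series[length:]
    return X, y

# Toy series 0..9 with one feature, shaped (10, 1) like scaled_train.
toy = np.arange(10).reshape(-1, 1)
X, y = make_windows(toy, length=3)

print(X.shape, y.shape)    # (7, 3, 1) (7, 1)
print(X[0].ravel(), y[0])  # [0 1 2] [3]
print(X[1].ravel(), y[1])  # [1 2 3] [4]
```

With a series of n points and a window of 48, this gives n - 48 input/label pairs, each input shaped (48, 1) to match `input_shape = (n_input, n_features)`.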

Now let’s define the model:

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

# Define the model: one LSTM layer with 200 units, then a single-output dense layer
model = Sequential()
model.add(LSTM(200, activation="relu", input_shape=(n_input, n_features)))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mse")
model.summary()
# fit() accepts generators directly; fit_generator() is deprecated in TensorFlow 2.x
model.fit(generator, epochs=5)

Notice that I have trained for only 5 epochs; you can train for more epochs to see how the model performs.

To see how the loss varied with the epoch, we can make a quick plot:

loss_per_epoch = model.history.history["loss"]
plt.plot(range(len(loss_per_epoch)), loss_per_epoch);

Now, let’s see how it performs on the test data.

import numpy as np

test_predictions = []

# Start from the last n_input points of the training data
first_eval_batch = scaled_train[-n_input:]
current_batch = first_eval_batch.reshape((1, n_input, n_features))

for i in range(len(test)):
    # Predict one step ahead; [0] takes the single sample out of the batch (2D to 1D)
    current_pred = model.predict(current_batch)[0]
    # Store the prediction
    test_predictions.append(current_pred)
    # Update the batch to include the prediction and drop the first value
    current_batch = np.append(current_batch[:, 1:, :], [[current_pred]], axis=1)

# Map the predictions back to the original scale
test_predictions = scaler.inverse_transform(test_predictions)
test = test.copy()  # work on a copy rather than a slice of data_sampled_hr
test["pred"] = test_predictions.ravel()
test.head()
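
The loop above is a recursive (rolling) forecast: each prediction is appended to the window and used as input for the next step. Here is a minimal sketch of the same mechanics with a stand-in predictor in place of the trained LSTM. `recursive_forecast` and `mean_model` are hypothetical names, and the "model" simply returns the window average.

```python
import numpy as np

def recursive_forecast(predict_fn, last_window, steps):
    """Forecast `steps` values, feeding each prediction back into the window."""
    batch = np.asarray(last_window, dtype=float).reshape(1, -1, 1)
    preds = []
    for _ in range(steps):
        next_val = float(predict_fn(batch))
        preds.append(next_val)
        # Drop the oldest value and append the new prediction.
        new_step = np.array(next_val).reshape(1, 1, 1)
        batch = np.concatenate([batch[:, 1:, :], new_step], axis=1)
    return np.array(preds)

def mean_model(batch):
    """Stand-in for model.predict: returns the average of the current window."""
    return batch.mean()

forecast = recursive_forecast(mean_model, [1.0, 2.0, 3.0], steps=3)
print(forecast)
```

Because later steps are predicted from earlier predictions and not from observed values, errors can accumulate as the forecast horizon grows.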

Plot the predictions:

test.plot(figsize = (12,8));

Although the predictions are far from perfect, you can see that the model picks up the general trend.
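
To put a number on how far off the predictions are, one option is to compute error metrics such as RMSE and MAE. The sketch below uses made-up arrays; the helper names `rmse` and `mae` are my own, and no results from the actual model are shown here.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error."""
    actual, predicted = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error."""
    actual, predicted = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Hypothetical values, only to show the calculation.
actual = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.5, 2.0, 2.0, 4.5])

print(rmse(actual, predicted))
print(mae(actual, predicted))  # 0.5
```

With the test frame from this article, the call would look like `rmse(test["Global_active_power"], test["pred"])`.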

We can now vary the number of epochs or change the time window (for example, 96 or 24 hours instead of 48) to see whether the model makes more accurate predictions. We can also experiment with other resampling frequencies.

Finally, instead of a single LSTM layer, we can stack multiple layers to experiment further.

This is a small effort to demonstrate how easily we can use an LSTM model to forecast time series. I hope this is helpful, and if you notice any area for improvement, feel free to leave a note. Thank you for reading!

Dipanwita Mallick
Analytics Vidhya

I am working as a Senior Data Scientist at Hewlett Packard Enterprise. I love exploring new ideas and new places !! :)