TimeseriesGenerator: A Deep Dive With an Example in Python

Vivien Leonard
Published in The Startup
3 min read · Jan 24, 2021
Photo by Mason Jones on Unsplash

What is TimeseriesGenerator?

TimeseriesGenerator comes from the well-known Keras library, which you can also find inside TensorFlow as tf.keras. It is used to turn a time series into batches of input windows and targets that a network can train on. For some reason, it rarely shows up in the many time series examples around the internet.

Just to be clear, you can only use it if you are doing deep learning with TensorFlow or Keras.

Let’s dive in. Say you are working with a time series, you have gone through preprocessing, and you now want to implement a neural network that produces a forecast from it. To keep things clear, I’ll be using a simple univariate series.

When handling a time series, you need to format it before feeding it to the network. You can find ready-made helper functions that people have shared to do this for you. But almost every one I tried failed for me, because of a type error, a difference between the version of numpy used by the author and the one I had at the time of the project, or some other cause. That is why I decided to look for a more official way of handling time series.

Let’s code a bit to show you how to use it in your project. First, my data:

train = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Here is my data. As I said, I’ll keep it simple, and this is the kind of data I use when I’m dealing with univariate time series.

TimeseriesGenerator basically wraps your time series inside a TimeseriesGenerator object that can later be fed directly into your network.

And then we create our object, with length set to 2 and batch_size set to 1 as in the full example further down:

generator = TimeseriesGenerator(train, train, stride=3, length=length, batch_size=batch_size)
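To see what this generator actually yields, you can index it like a list of batches. This is a minimal sketch, assuming a TensorFlow 2.x release that still ships TimeseriesGenerator under tensorflow.keras (it is deprecated in later releases), and assuming length=2 and batch_size=1, the values used in the full example later on.

```python
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

train = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Assumed values: length=2 and batch_size=1, as in the RNN example
generator = TimeseriesGenerator(train, train, length=2, stride=3, batch_size=1)

# Each item of the generator is one batch: a (samples, targets) tuple
batches = [generator[i] for i in range(len(generator))]

for samples, targets in batches:
    print(samples.tolist(), "->", targets.tolist())
```

With these settings I would expect three batches, pairing [0, 1] with 2, [3, 4] with 5 and [6, 7] with 8: each window is followed by its target, and the stride of 3 moves the window three steps at a time.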

Here is what you get when you call the .to_json() method on your TimeseriesGenerator object:

{"class_name": "TimeseriesGenerator", "config": {"data": "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]", "targets": "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]", "length": 2, "sampling_rate": 1, "stride": 3, "start_index": 2, "end_index": 10, "shuffle": true, "reverse": true, "batch_size": 1}}

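As you can see, the data and the other parameters are stored inside this object. Two things are worth noting: start_index is reported as 2 rather than 0 because the generator adds length to it internally, and shuffle and reverse both default to False, so they only show as true when they are passed explicitly. The parameters work as follows: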

  • data and targets are what the network will use: data holds the input series and targets holds the values to predict. Here both are the same series, so each sample is paired with the value that directly follows it.
  • length is the length of each sample used to train our network, counted in timesteps of the original series.
  • sampling_rate is the period between successive timesteps inside a sample. In our case, with a sampling_rate of 1, every timestep in the window is kept; with a sampling_rate of 2, only every other timestep would be kept.
  • stride is the period between successive output sub-series. For instance, with a stride of 3 as here, each sub-series starts 3 timesteps after the previous one.
  • start_index and end_index represent the range on which the samples will be built. For instance, if your time series has 15 records (indices 0 to 14), and you set start_index=1 and end_index=10, record 0 and the last 4 records (indices 11 to 14) won’t be used to build samples.
  • shuffle=True means that the output sub-series are drawn in random order. With shuffle=False, the default, they come out in chronological order.
  • reverse=True means that the timesteps inside each sub-series will be in reverse chronological order.
  • batch_size represents the number of output sub-series per batch. In our case, only one sub-series will be output per batch.
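To make the effect of these parameters concrete, here is a plain-Python sketch that mimics how I understand the windows to be built. The helper sketch_windows is hypothetical and written for illustration only: it is not part of Keras, and it ignores shuffle and batch_size.

```python
def sketch_windows(data, length, sampling_rate=1, stride=1,
                   start_index=0, end_index=None, reverse=False):
    """Hypothetical helper: list the (sample, target) pairs for given settings."""
    if end_index is None:
        end_index = len(data) - 1
    pairs = []
    # The first target sits `length` steps after start_index;
    # stride is the gap between two successive targets.
    for row in range(start_index + length, end_index + 1, stride):
        # sampling_rate is the gap between timesteps inside one sample
        sample = data[row - length:row:sampling_rate]
        if reverse:
            sample = sample[::-1]
        pairs.append((sample, data[row]))
    return pairs


series = list(range(11))

# stride=3: the window jumps three steps between samples
print(sketch_windows(series, length=2, stride=3))
# [([0, 1], 2), ([3, 4], 5), ([6, 7], 8)]

# sampling_rate=2: every other timestep inside a window of 4
print(sketch_windows(series, length=4, sampling_rate=2)[0])
# ([0, 2], 4)

# reverse=True flips the timesteps inside each sample
print(sketch_windows(series, length=2, stride=3, reverse=True)[0])
# ([1, 0], 2)

# start_index and end_index restrict the usable range
print(sketch_windows(series, length=2, start_index=1, end_index=5))
# [([1, 2], 3), ([2, 3], 4), ([3, 4], 5)]
```

The point to take away is that stride moves the window between samples, while sampling_rate thins out the timesteps inside a sample.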

How to use TimeseriesGenerator with an RNN?

The implementation is pretty straightforward:

import numpy as np
# Public tf.keras import paths (TensorFlow 2.x releases that still ship TimeseriesGenerator)
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

# Shape (n_records, n_features), so each sample comes out as (length, n_features)
train = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype="float32").reshape(-1, 1)

# How many records to take into account in each sample
length = 2

batch_size = 1

# 1 because it's univariate
n_features = 1

generator = TimeseriesGenerator(train, train, stride=3, length=length, batch_size=batch_size)

model = Sequential()
model.add(SimpleRNN(100, activation='relu', input_shape=(length, n_features)))
# Only the first layer needs input_shape
model.add(Dense(100, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss="mse")
model.fit(generator)

As you can see, we just create our generator, then our network, and finally we pass the generator to model.fit, which automatically pulls the batches of sub-series from it.
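Once the model is fitted, a forecast for the next value can be requested by passing the last length observations to model.predict. The sketch below again assumes a TensorFlow 2.x release that still ships TimeseriesGenerator. The value epochs=5 is an arbitrary choice of mine, and the model is far too small and too briefly trained for the predicted number to mean anything. The point is only the input shape.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

length, n_features = 2, 1

# Shape (n_records, n_features)
train = np.arange(11, dtype="float32").reshape(-1, n_features)

generator = TimeseriesGenerator(train, train, stride=3, length=length, batch_size=1)

model = Sequential()
model.add(SimpleRNN(100, activation="relu", input_shape=(length, n_features)))
model.add(Dense(1))
model.compile(optimizer="adam", loss="mse")

# epochs=5 is an assumed, arbitrary value
model.fit(generator, epochs=5, verbose=0)

# The last `length` observations, shaped (batch, timesteps, features)
last_window = train[-length:].reshape(1, length, n_features)
forecast = model.predict(last_window, verbose=0)

print(forecast.shape)
```

The network expects the same (timesteps, features) layout at prediction time as the generator produced during training, with a leading batch dimension added.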

Conclusion

I hope this very short article helped you discover this object, which I think can be a real help when dealing with time series and trying to use deep learning techniques.

Vivien Leonard
The Startup

PhD student working on NLP and the semantic web on Twitter data