Implementing a Transformer-Based Time Series Predictor

Easy Time Series Analysis Using Intel Tiber Developer Cloud

Published in Intel Analytics Software · 5 min read · Jun 25, 2024

I’ve been working with several clients lately who all seemed to be focused on time series analysis. This was a great excuse to explore current and older methods of time series prediction and clustering. I decided to break this experimentation into two articles: this one, on time series prediction, and another on a time series clustering trick using principal components analysis and DBSCAN.

Related article: Clustering Time Series with PCA and DBSCAN: Accelerating These Algorithms Using Intel Extension for Scikit-learn (medium.com)

A Google search of the Hugging Face community turned up a model called Chronos for this task. The GitHub location was missing from the model card, but a little more searching led me to a nice starting example from amazon-science. An explanation of Chronos and AutoGluon is beyond the scope of this article, but I encourage you to study the references. The concise description of what AutoGluon does, from their webpage, is:

Via a simple fit() call, AutoGluon can train and tune
- simple forecasting models (e.g., ARIMA, ETS, Theta),
- powerful deep learning models (e.g., DeepAR, Temporal Fusion Transformer),
- tree-based models (e.g., LightGBM),
- an ensemble that combines predictions of other models

to produce multi-step ahead probabilistic forecasts for univariate time series data.

Intel Tiber Developer Cloud

I decided to showcase my example using a free account on Intel Tiber Developer Cloud, which is an awesome and powerful sandbox for exploration. Below are the steps to get my initial project working. I hope to follow up on this example in a future article showing how to quantize the pipeline and perhaps run it on Intel Data Center GPU Max Series. For now, the example uses a 4th Gen Intel Xeon Scalable Processor. I already have an account on the Intel Tiber Developer Cloud, but you can enroll for free at the link below:

https://www.intel.com/content/www/us/en/developer/tools/devcloud/services.html

The next step was to launch my JupyterHub instance on Tiber by selecting “Training” in the left-hand menu panel:

Next, I launched my Jupyter instance by clicking the “Launch JupyterLab” button in the upper right of the Training page:

Upon login, I am greeted with a JupyterLab interface:

I added a folder called “Unsupervised” followed by a subfolder called “Chronos” as shown below:

From the Launcher, I created a new notebook that uses the preconfigured “PyTorch GPU” kernel. The code for the project can be found in my git repository: ChronosTimeSeriesPredictor. Launch a terminal session to run the Git commands:

This will create a browser tab with your Bash shell terminal:

Simply mkdir Unsupervised, then git clone the repository in the directory, as follows:

mkdir Unsupervised
cd Unsupervised/
git clone https://github.com/zmadscientist/ChronosTimeSeriesPredictor.git

Create a New Notebook for the Analysis

I added the following text cell to describe the purpose of the notebook:

# Time Series Prediction

Note:
Stop all kernels and be sure to select the PyTorch GPU kernel.
The data used in this exercise was synthesized by reconstructing and digitizing an electrical grid plot from Toronto-Hydro-Transformer-Monitors, then finding the mean and standard deviation for every time slice in the constructed graph across several days' worth of data. This provided a Gaussian distribution that could be sampled to generate an unlimited number of days' worth of data. Several such days of synthetic data have been saved in anomaly.csv.

We use this data as a starting point to generate new data in the desired Chronos format: anomalySeries.csv

You can follow the tutorial "Forecasting with Chronos", which was a great help in getting started:

https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-chronos.html

Retrieve the Data and Install the Libraries

I will be using time series data I generated for another project that is based on electrical grid data I found online from Toronto Hydro titled, Toronto-Hydro-Transformer-Monitors.

Install the necessary libraries using pip. The key libraries are torch, torchvision, autogluon, and chronos. I included these lines in a Jupyter cell, so simply uncomment them and run the cell. The installation will take several minutes!

# !pip install -U pip
# !pip install -U setuptools wheel
# # CPU version of pytorch has smaller footprint - see installation instructions in
# #pytorch documentation - https://pytorch.org/get-started/locally/
# ! pip install torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cpu
# ! pip install autogluon
# !pip install git+https://github.com/amazon-science/chronos-forecasting.git
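
Because the installation takes a while, it can help to confirm what actually landed in the environment before importing anything. The following is a minimal sketch using only the standard library; the helper name is hypothetical and the distribution names passed to it are assumptions that may need adjusting to match the output of pip list:

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to match `pip list`.
report = installed_versions(["torch", "torchvision", "autogluon"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```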

Finally, run this cell to import the modules:

from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from chronos import ChronosPipeline
from tqdm.auto import tqdm

Explore the Data

Notice the regular daily periodicity and the somewhat irregular electric grid values, which stay within a statistical envelope for each day. This data was generated by tracking a mean and standard deviation for every ten-minute interval of the day across multiple days. I modeled each time slice as a Gaussian distribution with its own mean and standard deviation, which I can sample to generate as much data as needed:

import matplotlib.pyplot as plt
import pandas as pd

# Load the series and plot the last 500 samples
df = pd.read_csv("anomalySeries.csv")
plt.plot(df.target[-500:])
plt.grid()
plt.show()
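
The generation procedure described above can be sketched in a few lines. This is a minimal illustration rather than the code that produced anomaly.csv: the function names, the random seed, and the toy sinusoidal "observed" days are all assumptions standing in for the digitized Toronto Hydro curves.

```python
import math
import random
import statistics

SLICES_PER_DAY = 24 * 6  # one slice per ten-minute interval


def slice_statistics(observed_days):
    """Return per-slice (means, stds) computed across several observed days."""
    per_slice = list(zip(*observed_days))  # regroup the values by time slice
    means = [statistics.mean(values) for values in per_slice]
    stds = [statistics.stdev(values) for values in per_slice]
    return means, stds


def sample_day(means, stds, rng):
    """Draw one synthetic day: an independent Gaussian sample per time slice."""
    return [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]


# Toy stand-in for the digitized days (hypothetical values, not the real data):
# a daily sinusoid plus a little noise.
rng = random.Random(0)
observed_days = [
    [100 + 20 * math.sin(2 * math.pi * i / SLICES_PER_DAY) + rng.gauss(0, 3)
     for i in range(SLICES_PER_DAY)]
    for _ in range(5)
]

means, stds = slice_statistics(observed_days)
# Concatenate as many synthetic days as needed into one long series.
synthetic_series = [value for _ in range(3) for value in sample_day(means, stds, rng)]
```

Sampling each slice independently keeps the daily shape while letting individual values wander inside the envelope, which matches the look of the plot.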

Prepare the Chronos Pipeline Parameters

pipeline = ChronosPipeline.from_pretrained(
  "amazon/chronos-t5-base",
  device_map="cpu", 
  torch_dtype=torch.bfloat16,
)

Read the Time Series

# Load the CSV into AutoGluon's TimeSeriesDataFrame format
data = TimeSeriesDataFrame("anomalySeries.csv")
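
AutoGluon's TimeSeriesDataFrame works with long-format data: one row per observation, with item_id, timestamp, and target columns. The sketch below shows how a list of values could be written in that layout. It assumes anomalySeries.csv follows this schema with a single item named "Series" (consistent with the df.target and item_ids=["Series"] calls in this notebook); the start timestamp and the helper names are hypothetical.

```python
import csv
import io
from datetime import datetime, timedelta


def to_long_format_rows(values, item_id="Series",
                        start=datetime(2024, 1, 1), step=timedelta(minutes=10)):
    """Yield (item_id, timestamp, target) rows at a fixed ten-minute spacing."""
    for i, value in enumerate(values):
        yield item_id, (start + i * step).isoformat(sep=" "), value


def write_long_format_csv(values, handle):
    """Write the rows under the column names AutoGluon looks for by default."""
    writer = csv.writer(handle)
    writer.writerow(["item_id", "timestamp", "target"])
    writer.writerows(to_long_format_rows(values))


# Write to an in-memory buffer here; pass an open file to create a real CSV.
buffer = io.StringIO()
write_long_format_csv([101.5, 99.2, 103.8], buffer)
csv_text = buffer.getvalue()
```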

Specify a prediction length and fit a Chronos predictor using the chronos_tiny preset:

prediction_length = 24
train_data, test_data = data.train_test_split(prediction_length)

predictor = TimeSeriesPredictor(prediction_length=prediction_length).fit(
    train_data, presets="chronos_tiny",
)
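
To my understanding of AutoGluon's train_test_split, the test set keeps each full series while the training set drops the last prediction_length steps, so those steps remain available as ground truth. With ten-minute data, a prediction length of 24 covers four hours. A plain-Python sketch of that slicing, using a hypothetical helper and a toy series:

```python
def holdout_split(series, prediction_length):
    """Mimic the split on one series: train omits the final forecast horizon."""
    if not 0 < prediction_length < len(series):
        raise ValueError("prediction_length must be between 1 and len(series) - 1")
    train = series[:-prediction_length]
    test = series  # the test set keeps the history plus the held-out horizon
    return train, test


prediction_length = 24
minutes_per_step = 10  # sampling interval of the synthetic grid data
horizon_hours = prediction_length * minutes_per_step / 60

toy_series = list(range(144))  # one day of ten-minute slices
train, test = holdout_split(toy_series, prediction_length)
held_out = test[len(train):]  # the steps available for checking a forecast
```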

Fit the Training Data

predictor = TimeSeriesPredictor(prediction_length=prediction_length).fit(
    train_data,
    hyperparameters={
        "Chronos": {
            "model_path": "tiny",
            "batch_size": 64,
            "device": "cpu",
        }
    },
    skip_model_selection=True,
    verbosity=0,
)

Predict the Test Data

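# Forecast the prediction_length steps that follow the end of the supplied data
predictions = predictor.predict(test_data)

# Plot the full history alongside the forecast
history_length = data.shape[0]
predictor.plot(
    data=data,
    predictions=predictions,
    item_ids=["Series"],
    max_history_length=history_length,
)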

Prediction (orange line) for Test Data by Chronos
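
A plot gives a visual impression, but a forecast can also be scored wherever actual values exist. For example, forecasting from train_data (as the AutoGluon tutorial does) yields predictions that cover the held-out tail of test_data. The sketch below uses plain lists with made-up numbers and hypothetical helper names; in the notebook, I would expect the point forecast and interval bounds to come from the "mean" and quantile columns of predictions, which is an assumption about AutoGluon's default output.

```python
def mean_absolute_error(actual, forecast):
    """Average absolute gap between the point forecast and the actual values."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside the forecast interval."""
    inside = sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
    return inside / len(actual)


# Made-up numbers for illustration only.
actual = [100.0, 104.0, 98.0, 101.0]
forecast = [101.0, 102.0, 99.0, 101.0]
lower = [97.0, 99.0, 99.5, 98.0]
upper = [103.0, 103.0, 102.0, 104.0]

mae = mean_absolute_error(actual, forecast)
coverage = interval_coverage(actual, lower, upper)
```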

Summary

You can see how easy it is to obtain a free account on the Intel Tiber Developer Cloud and use the Chronos model to quickly train and predict time series. I encourage you to give it a try!