Predicting anomalies using AutoAI Time Series API

Apr 6, 2023

Detecting anomalies, or outliers, in a series of data points recorded over time is known as time series anomaly detection. It matters in many industries, including finance, healthcare, cybersecurity and industrial manufacturing, where anomalies can indicate fraud, system failure or health risks. Time series anomaly detection algorithms are designed to identify and flag such anomalies automatically, allowing for timely and effective intervention.

IBM Watson AutoAI has recently introduced a new type of time series analysis: Anomaly Prediction. AutoAI Time Series Anomaly Prediction (TSAP) is typically used when your historical time series data is unlabeled but you know that all of it is normal, with no unusual or abnormal data points. You provide the historical data to TSAP, which uses multiple popular algorithms to train model-candidate pipelines that recognize the normal patterns in the data. You can then deploy a resulting model to predict anomalies in new data.

Let’s take a look at how easy it is to use the IBM AutoAI Python library to analyze the electricity usage for 18 months in order to detect potential anomalies in predicted usage for the next 6 months.

Setup

To work with AutoAI for time series, you must have a Watson Machine Learning service instance (included with the free plan) for IBM Cloud Pak for Data. Watson Machine Learning provides a Python interface through the ibm-watson-machine-learning package (available on PyPI). Install the package by running the following pip command:

pip install ibm-watson-machine-learning

Next, provide authentication information to initialize the Python client.

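from ibm_watson_machine_learning import APIClient
# Replace the placeholders with your own API key and service endpoint URL.
credentials = {"apikey": "<YOUR_API_KEY>", "url": "<YOUR_WML_ENDPOINT_URL>"}
client = APIClient(credentials)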

The time series data

Electricity usage data is a sample data set that contains the daily electricity consumption data of a fictional industry from January 1, 2020 through August 22, 2021.

Here is a visualization of this data set prepared using the plotly package.

To make the data available for the AutoAI experiment, we start by uploading the data to Cloud Object Storage.
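The upload step might look like the following minimal sketch. It assumes that a Cloud Object Storage connection asset already exists; `connection_id`, `bucket_name` and the file names are placeholders, and the helper name `upload_training_data` is ours, not part of the library. The `helpers` argument exists only so that the sketch can be exercised without a live service; by default it resolves to the library's `helpers` module.

```python
def upload_training_data(connection_id, bucket_name, local_path, remote_name, helpers=None):
    """Write a local CSV file to Cloud Object Storage and return its data connection."""
    if helpers is None:
        # Imported lazily: requires the ibm-watson-machine-learning package.
        from ibm_watson_machine_learning import helpers

    # Describe where the file will live in the bucket (placeholder values).
    data_connection = helpers.DataConnection(
        connection_asset_id=connection_id,
        location=helpers.S3Location(bucket=bucket_name, path=remote_name),
    )
    # Upload the local file under the given remote name.
    data_connection.write(data=local_path, remote_name=remote_name)
    return data_connection
```

The returned data connection is what we later pass to the training job as the training data reference.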

AutoAI for time series

Using the Python API, we define the AutoAI experiment for time series data and specify the following parameters for our experiment’s optimizer:

· name - experiment name

· prediction_type - problem type

· timestamp_column_name - name or index of the date and time column

· feature_columns - names or indices of the feature columns

· pipeline_types - pipelines to include, specified by type
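As a rough sketch, the parameters above could be passed to the experiment's `optimizer()` method as follows. The helper name `build_anomaly_optimizer`, the experiment name and the column names `date` and `industry_a_usage` are illustrative assumptions; substitute the column names of your own data set.

```python
def build_anomaly_optimizer(experiment, prediction_type, pipeline_types=None):
    """Define an AutoAI optimizer for time series anomaly prediction.

    `experiment` is expected to be an AutoAI experiment object from
    ibm_watson_machine_learning.experiment.
    """
    params = {
        "name": "Electricity usage anomaly prediction",  # experiment name (placeholder)
        "prediction_type": prediction_type,              # problem type
        "timestamp_column_name": "date",                 # hypothetical column name
        "feature_columns": ["industry_a_usage"],         # hypothetical column name
    }
    # Only restrict the pipeline types when the caller asks for it.
    if pipeline_types is not None:
        params["pipeline_types"] = pipeline_types
    return experiment.optimizer(**params)
```

With the real library, `experiment` would typically be created with `AutoAI(credentials, space_id=...)`, and `prediction_type` would be `AutoAI.PredictionType.TIMESERIES_ANOMALY_PREDICTION`. Check the documentation of your installed version for the available pipeline type values.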

Now, call the fit() method to start the training job.
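A minimal sketch of the training call, assuming `pipeline_optimizer` is the optimizer defined above and `training_data_connection` points at the uploaded file. The helper name `run_training` is ours, and the keyword `training_data_references` is the name we believe recent library versions use; older versions may expect a different keyword.

```python
def run_training(pipeline_optimizer, training_data_connection):
    """Start the AutoAI training job and wait for it to finish."""
    return pipeline_optimizer.fit(
        training_data_references=[training_data_connection],
        background_mode=False,  # block until the run completes
    )
```

Setting `background_mode=False` keeps the call synchronous, which is convenient in a notebook because the next cell can rely on the run being finished.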

After training is completed, we can list all of the model-candidate pipelines generated by the AutoAI experiment.
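Listing the pipelines could be sketched like this. We assume that `summary()` returns a table (a pandas DataFrame in the real library) indexed by pipeline name; the helper name `list_pipelines` is hypothetical.

```python
def list_pipelines(pipeline_optimizer):
    """Return the summary table and the names of the generated pipelines."""
    summary = pipeline_optimizer.summary()
    # The summary is assumed to be indexed by pipeline name.
    pipeline_names = list(summary.index)
    return summary, pipeline_names
```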

Next, we can retrieve pipeline details by calling the get_pipeline_details() method. The following is an example of getting detailed metric information for the various anomaly types.
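The call itself is short. In this sketch, the helper name `get_details` is ours, and we assume that omitting the pipeline name returns the details of the best pipeline. The exact layout of the returned metrics depends on the library version, so inspect the result before relying on specific keys.

```python
def get_details(pipeline_optimizer, pipeline_name=None):
    """Fetch details, including evaluation metrics, for one pipeline."""
    if pipeline_name is None:
        # Assumption: with no name given, the best pipeline is described.
        return pipeline_optimizer.get_pipeline_details()
    return pipeline_optimizer.get_pipeline_details(pipeline_name)
```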

Deployment and scoring

In this section, we will deploy the best pipeline as a web service, also called an online deployment. Then we will use the scoring endpoint to obtain predictions for the next half year.
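A sketch of the deployment step, assuming `service` is a `WebService` object from `ibm_watson_machine_learning.deployment` and `experiment_metadata` describes the finished experiment. The helper name `deploy_pipeline` and the argument values are placeholders.

```python
def deploy_pipeline(service, pipeline_name, experiment_metadata, deployment_name):
    """Create an online deployment for the chosen pipeline."""
    service.create(
        model=pipeline_name,           # name of the pipeline to deploy
        metadata=experiment_metadata,  # metadata of the AutoAI experiment
        deployment_name=deployment_name,
    )
    return service
```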

After the deployment is successfully created, we can ask for predictions using the score() method.
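Scoring might then look like the sketch below. The helper name `predict_anomalies` is hypothetical, and the expected shape of the payload (for example, a DataFrame of new observations) is an assumption that should be checked against the documentation for your library version.

```python
def predict_anomalies(service, new_data):
    """Send new observations to the deployed model and return its predictions."""
    return service.score(payload=new_data)
```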

Furthermore, we can visualize the predictions alongside the labeled columns in the new data to better understand the performance of the selected pipeline. From the sequence chart below, we can see that all three extreme points in September and October 2021 are predicted as anomalies. Additionally, several individual points in January 2022 are also predicted to be anomalies.

Summary

This article demonstrated how to use the IBM Watson AutoAI Python API to code an experiment that analyzes known time series data so that the resulting model can predict possible anomalies in new data. To learn more about how to use this feature to derive insights from your time series data, try the feature out on IBM Cloud Pak for Data as a Service, or view sample AutoAI notebooks.

Go to IBM Cloud and check this new feature out.
You can also find sample AutoAI notebooks here.

P.S. There is also a blog post that covers the general workflow of this feature in the GUI.

Acknowledgment

Thanks to Julianne Forgo for reviewing and revising this article.

Written by Jun Wang