Time series anomaly detection — in the era of deep learning
by Sarah Alnegheimish
In the previous post, we looked at time series data and anomalies. (If you haven’t done so already, you can read the article here.) In part 2, we will discuss time series reconstruction using generative adversarial networks (GANs)¹ and how reconstructing time series can be used for anomaly detection².
Time Series Anomaly Detection using Generative Adversarial Networks
Before we introduce our approach for anomaly detection (AD), let’s discuss one of today’s most interesting and popular models for deep learning: the generative adversarial network (GAN). The idea behind a GAN is that a generator (G), usually a neural network, takes random noise as input and attempts to construct a fake example, such as an image, that can fool a discriminator (D), which is also a neural network. D’s job is to distinguish “fake” examples from “real” ones. The two networks compete, and each improves at its own job as training proceeds. How powerful is this approach? Well, the figure below depicts some fake images generated from a GAN.
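To make the competition between G and D concrete, here is a minimal sketch of the two objectives in the standard GAN formulation. The function names and the example scores are ours, chosen for illustration, and this is not necessarily the loss used by the model discussed in this post. Here `d_real` and `d_fake` stand for the probabilities D assigns to a real and a generated example being real.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: D wants d_real near 1 and d_fake near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: G wants D to score its fake near 1."""
    return -math.log(d_fake)

# Case 1: D is confident and correct (illustrative scores).
confident = (discriminator_loss(0.9, 0.1), generator_loss(0.1))

# Case 2: D cannot tell real from fake, so it outputs 0.5 for both.
fooled = (discriminator_loss(0.5, 0.5), generator_loss(0.5))

print("D confident:", confident)
print("D fooled:   ", fooled)
```

When D is confident and correct, its loss is low and G’s loss is high; when D is fooled, the situation reverses. Training alternates between lowering each of these losses in turn.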
In this project, we leverage the same approach for time series. We adopt a GAN structure to learn the patterns of signals from an observed set of data and train the generator “G”. We then use “G” to reconstruct the time series and calculate the error as the discrepancy between the real and reconstructed signals. Regions where this error is unusually large are flagged as anomalies. You can read more about time series anomaly detection using GAN in our paper.
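The reconstruct-then-compare idea can be sketched in a few lines. This is a simplified illustration and not the method from the paper: we assume a point-wise absolute error and a fixed threshold of mean plus `k` standard deviations, and the two signals below are made-up values standing in for a real signal and its reconstruction.

```python
from statistics import mean, pstdev

def reconstruction_errors(real, reconstructed):
    """Point-wise absolute difference between two equal-length signals."""
    if len(real) != len(reconstructed):
        raise ValueError("signals must have the same length")
    return [abs(r - g) for r, g in zip(real, reconstructed)]

def flag_anomalies(errors, k=2.0):
    """Indices whose error exceeds mean + k * std (an assumed, simple rule)."""
    threshold = mean(errors) + k * pstdev(errors)
    return [i for i, e in enumerate(errors) if e > threshold]

# Made-up signals: the reconstruction misses the spike at index 3.
real = [10.0, 11.0, 10.5, 30.0, 10.2, 10.8, 11.1, 10.4]
reconstructed = [10.1, 10.9, 10.6, 10.7, 10.3, 10.7, 11.0, 10.5]

errors = reconstruction_errors(real, reconstructed)
anomalies = flag_anomalies(errors)
print(anomalies)
```

Because the generator has only learned the normal pattern, it fails to reproduce the spike, and that index is the only one whose error crosses the threshold.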
Enough talking — let’s look at some data.
In this tutorial, we will use a Python library called Orion to perform anomaly detection. After following the installation instructions available on GitHub, we can get started and run the notebook. Alternatively, you can launch Binder to access the notebook directly.
We continue examining the NYC taxi data maintained by Numenta. Their repository, available here, is full of AD approaches and labeled data. The data is organized as a series of timestamps and corresponding values, where each timestamp gives the time of observation in Unix time format, that is, the number of seconds elapsed since January 1, 1970 (UTC).
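Since Unix timestamps are hard to read at a glance, it can help to convert them to dates. The snippet below is a small sketch using only the Python standard library; the two rows are made-up values for illustration and are not taken from the actual dataset.

```python
from datetime import datetime, timezone

def to_utc(timestamp):
    """Convert a Unix timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)

# Made-up (timestamp, value) rows, 30 minutes apart.
rows = [(1404172800, 10844.0), (1404174600, 8127.0)]

readable = [(to_utc(ts).isoformat(), value) for ts, value in rows]
for when, value in readable:
    print(when, value)
```

Passing `tz=timezone.utc` keeps the result independent of the machine’s local time zone.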