Time series anomaly detection — in the era of deep learning

Part 2 of 3

MIT — Data to AI Lab
11 min read · Aug 28, 2020


by Sarah Alnegheimish

In the previous post, we looked at time series data and anomalies. (If you haven’t done so already, you can read the article here.) In part 2, we discuss time series reconstruction using generative adversarial networks (GANs)¹ and how reconstructing time series can be used for anomaly detection².

Time Series Anomaly Detection using Generative Adversarial Networks

Before we introduce our approach for anomaly detection (AD), let’s discuss one of today’s most interesting and popular models for deep learning: generative adversarial networks (GANs). The idea behind a GAN is that a generator (G), usually a neural network, attempts to construct a fake image from random noise that fools a discriminator (D), which is also a neural network. D’s job is to distinguish “fake” examples from “real” ones. The two networks compete, each trying to get better at its own job. How powerful is this approach? Well, the figure below depicts some fake images generated from a GAN.

Karras, Tero, Samuli Laine, and Timo Aila. “A style-based generator architecture for generative adversarial networks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4401–4410. 2019.
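
For reference, the competition between G and D described above is commonly written as a minimax objective. The form below is the one from the original GAN formulation (Goodfellow et al., 2014); the symbols for the data distribution and the noise distribution are notation introduced here for illustration, and GAN variants, possibly including the one used in this series, may optimize a different loss.

```latex
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

D tries to make both terms large by scoring real samples x near 1 and generated samples G(z) near 0, while G tries to make the second term small by producing samples that D scores near 1.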

In this project, we apply the same approach to time series. We adopt a GAN structure to learn the patterns of signals from an observed set of data and train the generator “G”. We then use “G” to reconstruct the time series and compute an error from the discrepancies between the real and reconstructed signals. Finally, we use this error to identify anomalies. You can read more about time series anomaly detection using GANs in our paper.
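
To make the reconstruct-then-compare idea concrete, here is a minimal sketch in Python. It is not the scoring used in the paper or in Orion: it assumes a simple point-wise absolute error and a mean-plus-k-standard-deviations threshold. The function names, the threshold rule, and the toy signal are all hypothetical, and the "reconstruction" is a stand-in for what a trained generator would produce.

```python
import numpy as np


def reconstruction_errors(real, reconstructed):
    """Point-wise absolute difference between two equally long signals."""
    real = np.asarray(real, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    if real.shape != reconstructed.shape:
        raise ValueError("signals must have the same shape")
    return np.abs(real - reconstructed)


def flag_anomalies(errors, k=3.0):
    """Indices whose error exceeds mean + k * std (an assumed, simple rule)."""
    threshold = errors.mean() + k * errors.std()
    return np.flatnonzero(errors > threshold)


# Toy data: a sine wave with one injected spike at index 120.
t = np.linspace(0, 4 * np.pi, 200)
real = np.sin(t)
real[120] += 5.0

# Stand-in for the generator's output: the clean pattern it would have learned.
reconstructed = np.sin(t)

errors = reconstruction_errors(real, reconstructed)
anomalous_idx = flag_anomalies(errors)
print(anomalous_idx)
```

Because the stand-in reconstruction follows the normal pattern, the error is large only where the real signal departs from it, which is the intuition behind using reconstruction error as an anomaly score.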

Enough talking — let’s look at some data.

Tutorial

In this tutorial, we will use a Python library called Orion to perform anomaly detection. After following the installation instructions available on GitHub, we can get started and run the notebook. Alternatively, you can launch Binder to access the notebook directly.

Load Data

In this tutorial, we continue examining the NYC taxi data maintained by Numenta. Their repository, available here, contains many AD approaches along with labeled data. The data is organized as a series of timestamps and corresponding values, and each timestamp gives the time of observation in Unix time format.
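
Unix timestamps are hard to read at a glance, so it can help to convert them to datetimes when inspecting the data. The sketch below uses pandas on a small, made-up table; the column names, the values, and the 30-minute spacing are assumptions for illustration and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical sample: timestamps are seconds since 1970-01-01 UTC,
# and the values are invented for this example.
df = pd.DataFrame({
    "timestamp": [1404172800, 1404174600, 1404176400],
    "value": [100.0, 80.0, 65.0],
})

# Interpret the integers as seconds since the Unix epoch.
df["datetime"] = pd.to_datetime(df["timestamp"], unit="s", utc=True)

print(df)
```

Keeping the original integer column alongside the converted one makes it easy to pass the raw timestamps on to other tools while still reading the data in human terms.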
