Open Source Anomaly Detection Projects

Himanshu Mittal
3 min read · May 30, 2019

For the past few weeks I have been building an anomaly detection service, which gave me a chance to research the existing open-source projects.

In this post, I focus on well-known open-source projects that can be used for anomaly detection. The intention is to provide a catalog of existing projects. This is not an exhaustive list, just what I was able to find.

The description of each project is taken from its official GitHub README.

  • Twitter Anomaly Detection R Package:
    AnomalyDetection is an open-source R package to detect anomalies which is robust, from a statistical standpoint, in the presence of seasonality and an underlying trend. The Anomaly Detection package can be used in a wide variety of contexts. For example, detecting anomalies in system metrics after a new software release, user engagement post an A/B test, or for problems in econometrics, financial engineering, political and social sciences.
    GitHub link: https://github.com/twitter/AnomalyDetection
    License: GNU General Public License v3.0
  • Yahoo EGADS Java Library:
    EGADS (Extensible Generic Anomaly Detection System) is an open-source Java package to automatically detect anomalies in large scale time-series data. EGADS is meant to be a library that contains a number of anomaly detection techniques applicable to many use-cases in a single package with the only dependency being Java. EGADS works by first building a time-series model which is used to compute the expected value at time t. Then a number of errors E are computed by comparing the expected value with the actual value at time t. EGADS automatically determines thresholds on E and outputs the most probable anomalies. EGADS library can be used in a wide variety of contexts to detect outliers and change points in time-series that can have various seasonal, trend and noise components.
    GitHub link: https://github.com/yahoo/egads
    License: GNU General Public License v3.0
  • Numenta HTM:
    The Numenta Platform for Intelligent Computing (NuPIC) is a machine intelligence platform that implements the HTM learning algorithms. HTM is a detailed computational theory of the neocortex. At the core of HTM are time-based continuous learning algorithms that store and recall spatial and temporal patterns. NuPIC is suited to a variety of problems, particularly anomaly detection and prediction of streaming data sources. For more information, see numenta.org or the NuPIC Forum.
    NAB (Numenta Anomaly Benchmark): https://github.com/numenta/NAB
    GitHub link: https://github.com/numenta/nupic
    License: GNU Affero General Public License v3.0
  • datastream.io:
    An open-source framework for real-time anomaly detection using Python, Elasticsearch, and Kibana.
    GitHub link: https://github.com/MentatInnovations/datastream.io
    License: Apache License 2.0
  • LinkedIn Luminol:
    Luminol is a lightweight Python library for time series data analysis. The two major functionalities it supports are anomaly detection and correlation. It can be used to investigate possible causes of an anomaly.
    GitHub link: https://github.com/linkedin/luminol
    License: Apache License 2.0
  • Telemanom:
    Telemanom employs vanilla LSTMs using Keras/TensorFlow to identify anomalies in multivariate sensor data. LSTMs are trained to learn normal system behaviors using encoded command information and prior telemetry values. Predictions are generated at each time step and the errors in predictions represent deviations from expected behavior. Telemanom then uses a novel nonparametric, unsupervised approach for thresholding these errors and identifying anomalous sequences of errors.
    GitHub link: https://github.com/khundman/telemanom
    License: Apache License 2.0
  • DeepADoTS:
    Repository of the paper “A Systematic Evaluation of Deep Anomaly Detection Methods for Time Series”.
    GitHub link: https://github.com/KDD-OpenSource/DeepADoTS
    License: Apache License 2.0
  • (Forecasting) Facebook Prophet:
    Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
    Homepage: https://facebook.github.io/prophet
    GitHub link: https://github.com/facebook/prophet
    License: MIT License
  • Netflix Surus:
    Over the next year we plan to release a handful of our internal user-defined functions (UDFs) that have broad adoption across Netflix. The use cases for these functions are varied in nature (e.g. scoring predictive models, outlier detection, pattern matching, etc.) and together extend the analytical capabilities of big data.
    GitHub link: https://github.com/netflix/surus
    License: Apache License 2.0
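
Several of the descriptions above (EGADS and Telemanom in particular) share one pattern: model the expected value at time t, measure the error against the actual value, and flag the errors that exceed a threshold. Here is a minimal Python sketch of that pattern. It is not how any of the listed projects is implemented; the rolling-median model, the MAD-based threshold, the function name `detect_anomalies`, and the parameters `window` and `k` are all assumptions made for illustration.

```python
from statistics import median


def detect_anomalies(series, window=5, k=5.0):
    """Return indices whose error against a rolling-median forecast is unusually large.

    Illustrative only: `window` and `k` are hypothetical tuning parameters.
    """
    errors = []
    for t in range(window, len(series)):
        # Step 1: expected value at time t = median of the previous `window` points.
        expected = median(series[t - window:t])
        # Step 2: error = absolute gap between actual and expected value.
        errors.append((t, abs(series[t] - expected)))

    if not errors:
        return []  # series too short to model

    # Step 3: derive a threshold from the errors themselves, using the
    # median absolute deviation (MAD) so a few large errors do not inflate it.
    values = [e for _, e in errors]
    med = median(values)
    mad = median([abs(e - med) for e in values])
    threshold = med + k * max(mad, 1e-9)

    return [t for t, e in errors if e > threshold]


if __name__ == "__main__":
    data = [10, 11, 10, 12, 11, 10, 11, 50, 11, 10, 12, 11]
    print(detect_anomalies(data))
```

The median is used for both the model and the threshold so that a single spike does not drag the expected value or the threshold towards itself. The projects above replace each step with something far more capable, such as seasonal models or LSTMs for the forecast.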

Update: rob_med has a list of tools & datasets for anomaly detection on time-series data.
Link: https://github.com/rob-med/awesome-TS-anomaly-detection/

If you know of other projects, please feel free to comment and I will add them to the list. I hope this helps you get started!
