An Introduction To Shapelets: The Shapes In Time Series

Ever wondered how a Fitbit or a similar gadget can tell whether you are walking or running, and automatically logs every workout? This is just one of the many applications of time series data.

Rohit Vincent
Version 1
4 min read · Apr 7, 2022


Photo by Yan Krukov from Pexels

Time series data is a collection of records obtained over time. The records always have an order, and changing that order could depict a completely different situation.

Real-world applications of time series are nearly endless, ranging from health care, human activity recognition, cyber-security, finance, and marketing to automated disease detection and anomaly detection. Because temporal data is so abundant, there is strong interest in applications based on time series, and many classification algorithms have been proposed.

How do we classify Time Series?

Photo by Anna Nekrashevich from Pexels

There are many methods to classify time series data. Some of the standard, well-known techniques pair K-Nearest Neighbours with time-series distance measures such as Dynamic Time Warping (DTW), Time Warp Edit (TWE) distance, or Complexity Invariant Distance (CID) to identify classes within the data.
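As a minimal sketch of the nearest-neighbour idea described above, the following pure-Python example pairs a textbook DTW distance with a 1-nearest-neighbour rule. The function names and the toy data are made up for illustration, and a real project would normally rely on an optimised library implementation instead.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = step + min(
                cost[i - 1][j],      # b[j-1] is matched to several points of a
                cost[i][j - 1],      # a[i-1] is matched to several points of b
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return math.sqrt(cost[n][m])


def predict_1nn(train_series, train_labels, query):
    """Return the label of the training series closest to `query` under DTW."""
    distances = [dtw_distance(series, query) for series in train_series]
    best = min(range(len(distances)), key=distances.__getitem__)
    return train_labels[best]


# Toy data (invented for this example): a bump versus a flat line.
train_series = [[0, 0, 1, 2, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
train_labels = ["bump", "flat"]
query = [0, 1, 2, 1, 0, 0, 0, 0]  # the same bump, shifted and one point longer

print(predict_1nn(train_series, train_labels, query))  # bump
```

Because DTW is allowed to stretch and compress the time axis, the shifted query still matches the "bump" series exactly, which a point-by-point Euclidean comparison could not do for series of different lengths.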

Deep learning methods also show potential in time series forecasting, since they can learn temporal dependence automatically. However, because time series data is high-dimensional, these techniques are expensive in terms of training time and memory requirements. To reduce the high computational burden of the traditional algorithms, Ye and Keogh proposed a concept known as shapelets.

What are shapelets?

In most time series data, the differences between classes show up within sub-sequences rather than across the complete series. Shapelets are meant to represent these discriminative sub-sequences. In simple terms, we identify a shape within a series that distinguishes its class from the other classes in that domain. An example of a shapelet is shown below.
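To make "a shape within the series" concrete, here is a small sketch of the usual way a candidate shapelet is compared with a whole series: slide it along the series and keep the smallest Euclidean distance to any window of the same length. The function name and data are assumptions for illustration, and the normalisation that published algorithms typically apply to each window is left out to keep the example short.

```python
import math


def subsequence_distance(shapelet, series):
    """Smallest Euclidean distance between `shapelet` and any
    equal-length window of `series`."""
    length = len(shapelet)
    if length > len(series):
        raise ValueError("shapelet must not be longer than the series")
    best = math.inf
    for start in range(len(series) - length + 1):
        window = series[start:start + length]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


shapelet = [0, 3, 0]                    # hypothetical "spike" shape
with_spike = [1, 1, 0, 3, 0, 1, 1]      # contains the shape exactly
without_spike = [1, 1, 1, 1, 1, 1, 1]   # never spikes

print(subsequence_distance(shapelet, with_spike))     # 0.0
print(subsequence_distance(shapelet, without_spike))  # about 2.449
```

A small distance means the series contains something that looks like the shapelet, so a simple threshold on this distance is already enough to separate the two toy series.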

Figure from Ye and Keogh, "Time series shapelets: a new primitive for data mining"

The figure above shows a leaf represented as a one-dimensional time series. The highlighted section is the subsequence that best represents this leaf. Shapelets can be identified in several ways, and different techniques aim to reduce either the discovery time or the classification time. Some of the well-known shapelet algorithms are Fast Shapelets and Learning Time-Series Shapelets.
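The sketch below illustrates what shapelet discovery can look like in its simplest, brute-force form: every subsequence of every training series is treated as a candidate, and the candidate whose distances best split the classes (measured here by information gain) is kept. It is a simplified illustration rather than a reproduction of any published algorithm; the function names and toy data are invented, all series are assumed to be at least `max_len` long, and the normalisation and pruning tricks that make real implementations practical are omitted.

```python
import math
from collections import Counter


def subsequence_distance(shapelet, series):
    """Smallest Euclidean distance from `shapelet` to any window of `series`."""
    length = len(shapelet)
    return min(
        math.sqrt(sum((series[i + k] - shapelet[k]) ** 2 for k in range(length)))
        for i in range(len(series) - length + 1)
    )


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_information_gain(distances, labels):
    """Best information gain over all distance thresholds."""
    ordered = sorted(zip(distances, labels))
    base = entropy(labels)
    best = 0.0
    for cut in range(1, len(ordered)):
        if ordered[cut - 1][0] == ordered[cut][0]:
            continue  # no threshold can separate two equal distances
        near = [label for _, label in ordered[:cut]]
        far = [label for _, label in ordered[cut:]]
        remainder = (len(near) * entropy(near) + len(far) * entropy(far)) / len(ordered)
        best = max(best, base - remainder)
    return best


def find_best_shapelet(dataset, labels, min_len, max_len):
    """Exhaustively score every subsequence and return (shapelet, gain)."""
    best_gain, best_shapelet = -1.0, None
    for series in dataset:
        for length in range(min_len, max_len + 1):
            for start in range(len(series) - length + 1):
                candidate = series[start:start + length]
                distances = [subsequence_distance(candidate, s) for s in dataset]
                gain = best_information_gain(distances, labels)
                if gain > best_gain:
                    best_gain, best_shapelet = gain, candidate
    return best_shapelet, best_gain


# Toy data (invented): "spike" series contain a value of 3, "flat" ones do not.
dataset = [
    [0, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
]
labels = ["spike", "spike", "flat", "flat"]

shapelet, gain = find_best_shapelet(dataset, labels, min_len=3, max_len=3)
print(shapelet, gain)
```

The nested loops make the cost of this search grow very quickly with the number and length of the series, which is the motivation for the faster discovery techniques mentioned above.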

Shapelet Implementations

Most shapelet implementations were written in C++ or Java, and there is no standard, official Python implementation of these algorithms. I am also currently working on a GitHub repository in Python to identify shapelets and use them for classification. Some of the open-source Python implementations of shapelets available right now are listed below:

Photo by Luis Gomes from Pexels

Learning Time-Series Shapelets by mohaseeb

Source: shaplets-python

Installation

Usage

Sktime by The Alan Turing Institute

Source: sktime

Sktime is a unified framework developed by the Alan Turing Institute for machine learning with time-series data. This package contains a shapelet transform, which can be used to extract shapelets from data.
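Conceptually, a shapelet transform turns each series into a row of features, where each feature is the distance from the series to one shapelet. The sketch below illustrates that idea in plain Python. It does not use or mirror sktime's actual API: the function names, the hand-picked shapelets, and the data are all assumptions made for this example.

```python
import math


def subsequence_distance(shapelet, series):
    """Smallest Euclidean distance from `shapelet` to any window of `series`."""
    length = len(shapelet)
    return min(
        math.sqrt(sum((series[i + k] - shapelet[k]) ** 2 for k in range(length)))
        for i in range(len(series) - length + 1)
    )


def shapelet_transform(dataset, shapelets):
    """Map every series to a feature row with one distance per shapelet."""
    return [
        [subsequence_distance(shapelet, series) for shapelet in shapelets]
        for series in dataset
    ]


# Hypothetical shapelets, chosen by hand rather than discovered from data.
shapelets = [[0, 3, 0], [2, 2, 2]]
dataset = [
    [0, 0, 3, 0, 0, 0],  # contains the first shapelet
    [1, 2, 2, 2, 1, 1],  # contains the second shapelet
]

features = shapelet_transform(dataset, shapelets)
for row in features:
    print([round(value, 3) for value in row])
```

The resulting table has one row per series and one column per shapelet, so it can be handed to any ordinary tabular classifier.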

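Installation

sktime is published on PyPI and conda-forge, so it can be installed with pip install sktime or with conda install -c conda-forge sktime.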

Usage

Conclusion

Photo by Luis Gomes from Pexels

In this article, I have introduced shapelets in time series and their advantages over traditional methods. In my next article, I intend to provide an in-depth view of the algorithms used to extract shapelets and how they can be applied to classification problems.

Thank you very much for reading! Let me know if you have any questions or comments.

About the Author:

Rohit Vincent is a Data Analytics Consultant here at Version 1.
