Time-series chaos? Use a feature store

Toby Coleman
Sep 14 · 3 min read

Feature stores are an important new concept in data science and machine-learning. They provide a way to organise much of the data preparation required when building a machine-learning model, and do it in a way that is repeatable and easy to deploy. They make data scientists more productive, and ease the path between research and production. We think they will be particularly useful when applied to time-series problems such as forecasting.

The bird’s-nest of code

ML models need to be fed well-organised features

Most data scientists have been there. A new project starts with neatly organised datasets and a set of scripts to train models and generate predictions from new data. As the project progresses, things get more complicated: extra datasets get bolted on, existing data modified and transformed, new ideas tested. Eventually we end up with a bird’s-nest of scripts and notebooks — and a massive task ahead of us if we want to turn this model into a production-ready system.

This is understandable: data science is an iterative process, and it is hard to know at the outset which datasets and features will be most useful. But this approach is problematic for three main reasons:

  1. The resulting scripts and notebooks are hard to understand: pity the poor data scientist who has just joined the team and needs to work out how the model works;
  2. They are error-prone and difficult to debug; and
  3. It is difficult and time-consuming to turn the model into a real-world, production-ready system, because your engineering team will need to spend months understanding and implementing all of the different data transformations that feed data into the model.

Lack of data science tools

While a lot of attention has rightly gone into developing fantastic new algorithms and techniques for machine-learning, far less has gone into tools for preparing the data that these models depend on.

Recently, however, feature stores have emerged as a new concept to help prepare data for use in machine learning. Rather than simply storing raw data in a database, a feature store is a system that organises data into pre-engineered features, ready to feed straight into a machine learning model.

Raw data usually needs to be transformed before it is fed into a machine-learning model. The transformed data are known as features.
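
To make this concrete, here is a minimal sketch using pandas. The dataset, the column name `demand_mw` and the helper `make_features` are all hypothetical, chosen only to illustrate the idea of turning raw readings into features.

```python
import pandas as pd

# Hypothetical raw data: hourly energy demand readings (values are made up).
raw = pd.DataFrame(
    {"demand_mw": [310.0, 295.0, 290.0, 305.0, 340.0, 390.0]},
    index=pd.date_range("2021-01-01", periods=6, freq=pd.Timedelta(hours=1)),
)


def make_features(df: pd.DataFrame) -> pd.DataFrame:
    """Turn raw demand readings into model-ready features."""
    features = pd.DataFrame(index=df.index)
    # Calendar feature: demand usually follows a daily pattern.
    features["hour_of_day"] = df.index.hour
    # Lag feature: the reading from one hour earlier.
    features["demand_lag_1h"] = df["demand_mw"].shift(1)
    # Difference feature: change since the previous reading.
    features["demand_change_1h"] = df["demand_mw"].diff()
    return features


features = make_features(raw)
print(features)
```

The first row has no previous reading, so its lag and change features are missing values. A model, or an earlier cleaning step, has to decide how to handle that.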

Some key advantages of this approach are:

  1. The code to compute features is organised and run on the feature store, meaning that this logic is kept separate from ML code;
  2. The same features can be served either in batches (for training) or in real-time (for production ML models), making it easier to deploy ML systems;
  3. Features can also be shared across different teams in an organisation, meaning that the effort spent on feature engineering can be re-used across multiple ML models.
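
As a rough illustration of the first two points, the sketch below shows a toy in-memory feature store. `TinyFeatureStore` and its methods `register`, `get_batch` and `get_latest` are names invented for this example, not the API of any real feature store. A production system would also need persistence, access control and much more.

```python
from typing import Callable, Dict, List

import pandas as pd


class TinyFeatureStore:
    """Illustrative in-memory feature store; not a real library's API."""

    def __init__(self, raw: pd.DataFrame) -> None:
        self._raw = raw.sort_index()
        self._transforms: Dict[str, Callable[[pd.DataFrame], pd.Series]] = {}

    def register(self, name: str, transform: Callable[[pd.DataFrame], pd.Series]) -> None:
        # Feature logic lives in the store, separate from any model code.
        self._transforms[name] = transform

    def get_batch(self, names: List[str]) -> pd.DataFrame:
        # Batch access: the full feature history, e.g. for training.
        return pd.DataFrame({n: self._transforms[n](self._raw) for n in names})

    def get_latest(self, names: List[str]) -> Dict[str, float]:
        # Online access: only the most recent values, e.g. for live predictions.
        return self.get_batch(names).iloc[-1].to_dict()


# Hypothetical raw data (values are made up).
raw = pd.DataFrame(
    {"demand_mw": [100.0, 110.0, 120.0, 130.0, 140.0]},
    index=pd.date_range("2021-01-01", periods=5, freq=pd.Timedelta(hours=1)),
)

store = TinyFeatureStore(raw)
store.register("demand_lag_1", lambda df: df["demand_mw"].shift(1))
store.register("demand_mean_3", lambda df: df["demand_mw"].rolling(3).mean())

training_set = store.get_batch(["demand_lag_1", "demand_mean_3"])
latest = store.get_latest(["demand_lag_1", "demand_mean_3"])
```

Because training and live predictions both go through the same registered transforms, there is only one copy of the feature logic to maintain.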

Time-series features

We think feature stores will be particularly useful when working with time-series data — everything from financial market prices and energy demand to weather forecasts and retail sales information. This is because time series present some unique difficulties when trying to implement a robust machine-learning system. For example:

  • Point-in-time correctness: how do you make sure that you only train your model on data that was available at the time a prediction would have been made? This crops up a lot with weather data, where historical forecasts can be difficult to handle in a reliable way.
  • Complex feature engineering: for example, rolling windows of past data are often required for time-series forecasting models. Keeping this code separate from the machine learning model helps make it more maintainable and reusable.
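
Both difficulties can be sketched in a few lines of pandas. The data and column names below (`issued_at`, `prediction_time`, `forecast_temp_c`) are hypothetical, and this is one possible approach rather than a definitive implementation. `pd.merge_asof` with `direction="backward"` picks, for each prediction time, the most recent forecast issued at or before that time, and shifting a series before taking a rolling mean keeps the current value out of its own window.

```python
import pandas as pd

# Hypothetical weather forecasts, keyed by the time each one was issued.
forecasts = pd.DataFrame(
    {
        "issued_at": pd.to_datetime(
            ["2021-01-01 00:00", "2021-01-01 06:00", "2021-01-01 12:00"]
        ),
        "forecast_temp_c": [4.0, 5.5, 7.0],
    }
)

# Times at which the model would have had to make a prediction.
predictions = pd.DataFrame(
    {
        "prediction_time": pd.to_datetime(
            ["2021-01-01 03:00", "2021-01-01 07:00", "2021-01-01 11:00"]
        )
    }
)

# Point-in-time join: each row only sees forecasts already issued by then.
# Both frames must be sorted by their time keys.
joined = pd.merge_asof(
    predictions,
    forecasts,
    left_on="prediction_time",
    right_on="issued_at",
    direction="backward",
)

# Rolling-window feature built from past values only.
demand = pd.Series(
    [10.0, 12.0, 14.0, 16.0],
    index=pd.date_range("2021-01-01", periods=4, freq=pd.Timedelta(hours=1)),
)
past_mean_3h = demand.shift(1).rolling(3).mean()
```

In this example the 12:00 forecast is never used, because no prediction is made after it was issued. The first three values of `past_mean_3h` are missing, because three earlier readings are not yet available.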

We think that feature stores with simple, easy-to-understand APIs will allow data scientists to be much more productive when tackling time-series problems. For engineers, the resulting models will be much easier to turn into production-ready systems.

We build tools to make it easier to integrate time-series data with your ML applications and analytics. Get in touch if you’d like to know more about this topic.

Feature Stores for ML

AI, Data, and everything in between
