Strange things are afoot at Circle K

Fred Hoffman
bytehub-ai
2 min read · Dec 9, 2020

Solving the Machine Learning Time-Travel Problem

Sometimes preparing data to build machine learning prototypes on historic data can feel like Bill and Ted trying to pass a history test. Then, after developing a promising model, deploying it on future data can feel like damaging Rufus’ phone booth time machine on departure and ending up in the far future.

There are a number of time-travel-related challenges in developing and deploying machine learning models, particularly when working with time series. All machine learning models use training data from the past to make predictions about future data. If data from the future leaks into our training sets, our predictions won’t rock. To make matters worse, when developing time series models we often need historical forecasts to avoid peeking into the future, as well as historical actuals for validating our predictions. Engineering this into data pipelines can feel like being dragged through the Circuits of Time backwards.

Many teams struggle with a lack of data infrastructure designed with machine learning and the paradoxes of time travel in mind. However, you shouldn’t have to resort to kidnapping the great historical figures of the past. There are only two things that a data scientist should need to specify to easily construct high-quality training data:

  1. Where you want to time travel to (bold!)
  2. The features that you want to bring back (audacious!!)

That is where ByteHub’s data platform steps out of the phone booth from the future.

>> import bytehub as bh
>> fs = bh.FeatureStore()

Based on the concept of a feature store, it makes time-series preparation that simple.

>> fs.get_timeseries('met-office.uk-temperature.forecast',
       from_date='2020-11-01', to_date='2020-11-11', freq='60min',
       time_travel='-6h')

Specify the feature or list of features that you want to bring back. In this case, the ByteHub feature store came preloaded with machine-learning-ready historical forecasts from Met Office weather models. We fetch historical temperature forecasts over a 10-day window, sampled at a 60-minute frequency. Then all that is left is to specify where you want to time travel to: in this case, 6 hours into the past. This returns the Met Office view of the 10-day window as it stood 6 hours ago. Bodacious!
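
For the curious, here is a rough sketch of what an "as of 6 hours ago" view could look like under the hood. It is not ByteHub's implementation. It assumes each stored value carries both the time it describes and the time it was written, and the names `records`, `as_of_view` and `now` are hypothetical, made up for this illustration.

```python
from datetime import datetime, timedelta

# Hypothetical stored rows: (time, created_time, value).
# `time` is the moment the value describes; `created_time` is when it was written.
records = [
    (datetime(2020, 11, 1, 12), datetime(2020, 10, 31, 18), 8.0),
    (datetime(2020, 11, 1, 12), datetime(2020, 11, 1, 0), 9.0),
    (datetime(2020, 11, 1, 12), datetime(2020, 11, 1, 9), 10.5),
    (datetime(2020, 11, 1, 13), datetime(2020, 11, 1, 0), 9.5),
]


def as_of_view(records, as_of):
    """Return {time: value} using only rows created at or before `as_of`."""
    latest = {}  # time -> (created_time, value)
    for time, created, value in records:
        if created > as_of:
            continue  # written after our vantage point, so we must not see it
        if time not in latest or created > latest[time][0]:
            latest[time] = (created, value)
    return {time: value for time, (_, value) in sorted(latest.items())}


# Pretend "now" is midday on 1 November, then travel 6 hours into the past.
now = datetime(2020, 11, 1, 12)
view = as_of_view(records, as_of=now - timedelta(hours=6))
print(view)
```

In this toy data, the 12:00 value written at 09:00 is ignored because it did not exist yet at 06:00, so the view falls back to the value written at midnight. Calling `as_of_view(records, as_of=now)` would pick up the later revision instead.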

We think that time-travelling feature stores with simple, easy-to-understand APIs will allow data scientists to be much more productive when tackling time-series problems and will make it much easier to move models into production-ready systems. So next time you start a data science project, invest in a feature store.

At ByteHub AI we build tools to make it easier to integrate time-series data with your ML applications and analytics. Get in touch if you’d like to know more about this topic.
