Time Lord: A look at Amazon Chronos

Will G

Published in One Cool Thing

3 min read · Jun 4, 2024

Code here. Paper here. Amazon blog post here.

Executive Summary for Managers and Leaders:

What is it?: A pre-trained, deep learning-based model that forecasts time series data, even without being trained on your data! In some cases, Chronos outperforms models that were trained specifically on the time series being forecast. It achieves this “zero-shot” performance by borrowing two ideas from the technology that undergirds most Large Language Models: extensive, large-scale exposure to data, and treating the time series as a sequence of “tokens.”

Why should you care?: There’s a lot of practical (and important) data out there that can be characterized as a time series, such as:

Daily users
Passengers per week
Hourly temperature
Daily production

Forecasts typically rely on careful, artful hand-tuning and on domain context. Chronos appears to offer a lot of power straight out of the box for an initial forecast without either. Additionally, if you have little history for the time series that you care about, Chronos might still forecast it reasonably well because of its substantial pre-training. Depending on what is “good enough” for your needs, it might be all that you need.

What questions should you be asking your DS/ML folks?:

How much data do we have for forecasting for this problem?
What methods have you tried for forecasting so far?
How are you assessing the quality of your forecasts? (Don’t say “accuracy,” since that means something specific to data scientists)
How often do your forecasts over/underestimate the actual values we’ve seen before?
Time series forecasts do not interpret themselves. What do you think the forecast trend could be indicating?
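The questions about forecast quality and over/underestimation can be made concrete in a few lines. The sketch below is illustrative only: the function names and the toy numbers are my own assumptions, not anything taken from Chronos. It computes the mean absolute scaled error (MASE), which compares a forecast against a naive "repeat the last value" baseline, and the share of forecast points that came in above the actual values.

```python
def mase(history, actual, forecast, season=1):
    """Mean absolute scaled error.

    Forecast MAE divided by the in-sample MAE of a (seasonal) naive forecast.
    Values below 1 mean the forecast beat the naive baseline.
    """
    naive_errors = [abs(history[i] - history[i - season])
                    for i in range(season, len(history))]
    naive_mae = sum(naive_errors) / len(naive_errors)
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / naive_mae


def overestimate_rate(actual, forecast):
    """Fraction of forecast points that landed above the actual value."""
    overs = sum(1 for a, f in zip(actual, forecast) if f > a)
    return overs / len(actual)


# Toy data (hypothetical): six observed periods, then three forecast periods.
history = [100, 104, 103, 108, 110, 109]
actual = [112, 115, 113]
forecast = [111, 117, 116]

print(f"MASE: {mase(history, actual, forecast):.2f}")
print(f"Overestimated {overestimate_rate(actual, forecast):.0%} of the time")
```

A MASE near or above 1 is a hint that the forecast is not adding much over the simplest possible baseline, which is a more useful conversation starter than "accuracy."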

Summary for Data Scientists/ML Engineers/The-Technically-Curious:

What is it?: A pre-trained, T5 Transformer-based architecture that has been trained on 55 publicly-available time series datasets. The real datasets were augmented with TSMixup, an adaptation of the Mixup algorithm that was originally developed for image classification, and supplemented with synthetic series generated by a kernel synthesis technique (KernelSynth).
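To give a feel for the Mixup adaptation (called TSMixup in the paper), here is a minimal sketch of the idea as I understand it: pick a few series, scale each one, and blend them with random weights that sum to 1. The helper names are hypothetical, and the defaults (up to 3 series, Dirichlet concentration 1.5) are my reading of the paper and should be checked against it. This is not the authors' implementation.

```python
import random


def mean_scale(series):
    """Divide a series by its mean absolute value so series are comparable."""
    scale = sum(abs(x) for x in series) / len(series)
    return [x / scale for x in series] if scale > 0 else list(series)


def dirichlet_weights(k, alpha, rng):
    """Draw k positive weights summing to 1 (symmetric Dirichlet via gammas)."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def ts_mixup(pool, length, max_series=3, alpha=1.5, rng=None):
    """Blend up to `max_series` random windows from `pool` into one series.

    Assumes every series in `pool` has at least `length` points.
    """
    rng = rng or random.Random()
    k = rng.randint(1, max_series)
    weights = dirichlet_weights(k, alpha, rng)
    windows = []
    for _ in range(k):
        series = rng.choice(pool)
        start = rng.randint(0, len(series) - length)
        windows.append(mean_scale(series[start:start + length]))
    # Convex combination of the scaled windows, one time step at a time.
    return [sum(w * win[t] for w, win in zip(weights, windows))
            for t in range(length)]


# Toy pool (hypothetical): an upward trend and a repeating pattern.
pool = [
    [float(i) for i in range(1, 21)],
    [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0] * 3,
]
print(ts_mixup(pool, length=8, rng=random.Random(42)))
```

Because the weights are a convex combination, the blended series stays on the same scale as its ingredients while mixing their patterns.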

What is cool about it?: Treating a time series as a sequence of tokens, as has been done with language, is an interesting framing of the problem: it ports advancements from Natural Language Processing back into the field that initially led to breakthroughs in NLP. The TSMixup augmentation method may also be useful on its own with traditional time series forecasting methods, in cases where available time series are sparse and adopting Chronos is not viable.
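The "time series as tokens" idea boils down to scaling the values and then quantizing them into a fixed vocabulary of bins. Below is a minimal sketch of that idea, not the Chronos tokenizer itself: the function names are hypothetical, and the bin count and range are toy values chosen for readability (the real vocabulary is much larger).

```python
def tokenize(series, n_bins=64, low=-4.0, high=4.0):
    """Map real values to integer token ids via mean scaling + uniform bins."""
    scale = sum(abs(x) for x in series) / len(series)
    if scale == 0:
        scale = 1.0  # avoid dividing by zero on an all-zero series
    width = (high - low) / n_bins
    tokens = []
    for x in series:
        scaled = min(max(x / scale, low), high)  # clip to the binned range
        tokens.append(min(int((scaled - low) / width), n_bins - 1))
    return tokens, scale


def detokenize(tokens, scale, n_bins=64, low=-4.0, high=4.0):
    """Map token ids back to values using each bin's center."""
    width = (high - low) / n_bins
    return [(low + (t + 0.5) * width) * scale for t in tokens]


series = [10.0, 20.0, 30.0, 20.0]
tokens, scale = tokenize(series)
print(tokens)
print(detokenize(tokens, scale))
```

The round trip is lossy: each value comes back as the center of its bin, so the reconstruction error is at most half a bin width times the scale. More bins mean finer resolution at the cost of a larger vocabulary.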

Questions I am thinking about:

When would we opt to use a pre-trained, Chronos-like model rather than a “traditional” time series forecasting approach?
What biases may have been encoded into Chronos that are not representative of all time series, and how could they be mitigated?
Typically, a forecast is just another data point; it is not prescriptive of action to take or a decision to make. Frequently, time series forecasts get fed into other ML models as features for prediction. Once Chronos has generated a forecast, what next?
