Toto 2.0: Time Series Forecasting Enters the Scaling Era

📅 2026-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of established scaling laws for time series forecasting by proposing a single training recipe that yields a family of five foundation models ranging from 4M to 2.5B parameters. Using a consistent architecture, a standardized training protocol, and u-muP hyperparameter transfer, the study shows empirically that forecast quality improves reliably with model scale. The resulting models achieve state-of-the-art results on three benchmarks (BOOM, GIFT-Eval, and TIME), and all five base checkpoints are released under the Apache 2.0 license to support further research and reproducibility.
📝 Abstract
We show that time series foundation models scale: a single training recipe produces reliable forecast-quality improvements from 4M to 2.5B parameters. We release Toto 2.0, a family of five open-weights forecasting models trained under this recipe. The Toto 2.0 family sets a new state of the art on three forecasting benchmarks: BOOM, our observability benchmark; GIFT-Eval, the standard general-purpose benchmark; and the recent contamination-resistant TIME benchmark. This report describes our experimental results and details the design decisions behind Toto 2.0: its architecture and training recipe, training data, and the u-muP hyperparameter transfer pipeline. All five base checkpoints are released under Apache 2.0.
Problem

Research questions and friction points this paper is trying to address.

time series forecasting
foundation models
scaling
forecasting benchmarks
model generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

time series forecasting
foundation models
model scaling
u-muP hyperparameter transfer
open-weights models
Emaad Khwaja
Datadog AI Research
Chris Lettieri
Datadog AI Research
Gerald Woo
Senior Research Scientist, Datadog AI Research
Time Series, Machine Learning, Deep Learning
Eden Belouadah
Datadog AI Research
Marc Cenac
Datadog AI Research
Guillaume Jarry
Datadog AI Research
Enguerrand Paquin
Datadog AI Research
Xunyi Zhao
Datadog AI Research
Viktoriya Zhukov
Datadog AI Research
Othmane Abou-Amal
Datadog AI Research
Chenghao Liu
Datadog AI Research
Ameet Talwalkar
CMU, Datadog
Machine Learning
David Asker
Datadog AI Research