Moirai 2.0: When Less Is More for Time Series Forecasting

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the longstanding trade-off among accuracy, efficiency, and model size in probabilistic time series forecasting, this paper introduces Moirai 2.0, a decoder-only foundation model pretrained on a corpus of 36 million real-world time series. It adopts single-patch input, quantile output representation, and a recursive multi-quantile decoding mechanism, replacing the masked encoder, multi-patch inputs, and mixture-distribution outputs of its predecessor and enabling end-to-end optimization via quantile loss. Compared to its prior best version, Moirai 1.0-Large, Moirai 2.0 reduces parameter count by 30x and doubles inference speed, while ranking among the top pretrained models on benchmarks including Gift-Eval. Ablation and domain-level experiments demonstrate strong generalization and robustness across diverse domains, supporting lightweight decoder-only architectures for large-scale time series foundation modeling.

📝 Abstract
We introduce Moirai 2.0, a decoder-only time-series foundation model trained on a new corpus of 36M series. The model adopts quantile forecasting and multi-token prediction, improving both probabilistic accuracy and inference efficiency. On the Gift-Eval benchmark, it ranks among the top pretrained models while achieving a strong trade-off between accuracy, speed, and model size. Compared to Moirai 1.0, Moirai 2.0 replaces masked-encoder training, multi-patch inputs, and mixture-distribution outputs with a simpler decoder-only architecture, single patch, and quantile loss. Ablation studies isolate these changes, showing that the decoder-only backbone along with recursive multi-quantile decoding contribute most to the gains. Additional experiments show that Moirai 2.0 outperforms larger models from the same family and exhibits robust domain-level results. In terms of efficiency and model size, Moirai 2.0 is twice as fast and thirty times smaller than its prior best version, Moirai 1.0-Large, while also performing better. Model performance plateaus with increasing parameter count and declines at longer horizons, motivating future work on data scaling and long-horizon modeling. We release code and evaluation details to support further research.
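The abstract's "quantile loss" refers to the standard pinball loss, which penalizes under- and over-prediction asymmetrically per quantile level. A minimal sketch of the objective (the specific quantile levels here are illustrative, not the paper's configuration):

```python
import numpy as np

def quantile_loss(y_true, y_pred, q):
    """Pinball loss for quantile level q in (0, 1).

    Under-prediction (y_true > y_pred) is weighted by q;
    over-prediction is weighted by (1 - q).
    """
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# Training objective: average pinball loss over a set of quantile levels.
quantiles = [0.1, 0.5, 0.9]          # illustrative levels
y = np.array([1.0, 2.0, 3.0])
preds = {                             # per-level forecasts (toy values)
    0.1: np.array([0.8, 1.7, 2.5]),
    0.5: np.array([1.0, 2.1, 3.0]),
    0.9: np.array([1.3, 2.6, 3.8]),
}
total = sum(quantile_loss(y, preds[q], q) for q in quantiles) / len(quantiles)
```

At q = 0.5 the pinball loss reduces to half the mean absolute error, so the median forecast is trained as an L1 point estimate.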
Problem

Research questions and friction points this paper is trying to address.

Making time series forecasting efficient through a simplified, decoder-only architecture
Improving probabilistic accuracy via quantile forecasting
Balancing the trade-off between model size, inference speed, and prediction performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoder-only architecture for time series forecasting
Quantile forecasting with multi-token prediction
Single patch input replacing multi-patch inputs
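The recursive multi-quantile decoding named above can be sketched as an autoregressive loop: at each step the model emits a full set of quantile predictions, and one of them (the median here, which is an assumption; the paper may condition differently) is fed back as the next input. The `model_step` interface and `toy_model` below are hypothetical stand-ins, not the paper's API:

```python
import numpy as np

def recursive_quantile_forecast(model_step, context, horizon,
                                quantiles=(0.1, 0.5, 0.9)):
    """Sketch of recursive multi-quantile decoding.

    model_step(context) -> dict mapping quantile level to the
    next-step prediction at that level. The median prediction is
    appended to the context so later steps condition on it.
    """
    context = list(context)
    path = []
    for _ in range(horizon):
        step_q = model_step(np.asarray(context))
        path.append(step_q)
        context.append(step_q[0.5])  # feed the median back (assumed)
    return path

# Toy stand-in model: predicts last value + 1 with a symmetric spread.
def toy_model(ctx):
    last = float(ctx[-1])
    return {0.1: last + 0.5, 0.5: last + 1.0, 0.9: last + 1.5}
```

The single decoding pass yields calibrated intervals at every horizon step without sampling, which is one plausible source of the inference-speed gains the summary reports.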