🤖 AI Summary
Zero-shot time-series forecasting models suffer from inefficient long-range modeling, weak generalization, and limited reproducibility. To address these challenges, we propose TempoPFN, the first fully parallelizable univariate foundation model based on linear RNNs. Our method introduces two key components: (i) a GatedDeltaProduct recurrence and (ii) a state-weaving mechanism, which together enable fully parallel training and inference over sequences of arbitrary length. We further design a high-fidelity synthetic data pipeline that integrates stochastic differential equations, Gaussian processes, and audio-synthesis generators to strengthen zero-shot transfer. On the Gift-Eval benchmark, our approach outperforms all purely synthetic pre-trained models and surpasses most models trained on real-world data, while achieving substantial gains in both training and inference efficiency. All code and data pipelines are publicly released.
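The paper's GatedDeltaProduct recurrence is defined in the full text; as a rough, hedged illustration of the family it belongs to, the sketch below implements a single *gated delta-rule* state update, the linear-RNN building block that GatedDeltaProduct generalizes. All names, shapes, and gate values here are our own illustrative choices, not the paper's.

```python
import numpy as np

def gated_delta_step(S, k, v, alpha, beta):
    """One gated delta-rule update of the recurrent state matrix S.

    S:     (d_v, d_k) state matrix carried across time steps
    k:     (d_k,) key vector (assumed unit-norm here)
    v:     (d_v,) value vector
    alpha: scalar gate in (0, 1) decaying the old state
    beta:  scalar write strength in (0, 1)
    """
    d_k = k.shape[0]
    # Decay the old state, erase the component along k, then write v k^T.
    return alpha * S @ (np.eye(d_k) - beta * np.outer(k, k)) + beta * np.outer(v, k)

rng = np.random.default_rng(0)
S = np.zeros((4, 3))                       # empty state
k = rng.normal(size=3); k /= np.linalg.norm(k)
v = rng.normal(size=4)
S = gated_delta_step(S, k, v, alpha=0.9, beta=0.5)
# Reading out with the same key retrieves a scaled copy of v.
print(np.allclose(S @ k, 0.5 * v))  # True
```

Because each step is an affine map of the previous state, a sequence of such updates can be evaluated with a parallel scan rather than a strictly sequential loop, which is what makes linear RNNs of this kind parallelizable over the sequence dimension.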
📝 Abstract
Foundation models for zero-shot time series forecasting face challenges in efficient long-horizon prediction and reproducibility, and existing synthetic-only approaches underperform on challenging benchmarks. This paper presents TempoPFN, a univariate time series foundation model based on linear Recurrent Neural Networks (RNNs) and pre-trained exclusively on synthetic data. The model uses a GatedDeltaProduct architecture with state-weaving, enabling fully parallelizable training across sequence lengths and eliminating the need for windowing or summarization techniques while maintaining robust temporal state-tracking. Our comprehensive synthetic data pipeline unifies diverse generators, including stochastic differential equations, Gaussian processes, and audio synthesis, with novel augmentations. In zero-shot evaluations on the Gift-Eval benchmark, TempoPFN achieves top-tier performance, outperforming all existing synthetic-only approaches and surpassing the vast majority of models trained on real-world data, while being more efficient than existing baselines thanks to fully parallelizable training and inference. We open-source our complete data generation pipeline and training code, providing a reproducible foundation for future research.
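The abstract names three generator families in the synthetic pipeline: stochastic differential equations, Gaussian processes, and audio synthesis. As a toy sketch of how the first two might produce training series, the snippet below samples from a Gaussian process with an RBF kernel and simulates an Ornstein-Uhlenbeck SDE with Euler-Maruyama, then mixes them. The generator choices, kernel, and parameters are our assumptions for illustration, not the paper's actual pipeline.

```python
import numpy as np

def sample_gp(ts, length_scale=0.2, variance=1.0, rng=None):
    """Draw one series from a Gaussian process with an RBF kernel."""
    rng = rng or np.random.default_rng()
    d = ts[:, None] - ts[None, :]
    K = variance * np.exp(-0.5 * (d / length_scale) ** 2)
    # Jitter on the diagonal keeps the covariance numerically positive-definite.
    return rng.multivariate_normal(np.zeros(len(ts)), K + 1e-8 * np.eye(len(ts)))

def sample_ou(ts, theta=2.0, mu=0.0, sigma=0.5, rng=None):
    """Simulate an Ornstein-Uhlenbeck SDE via Euler-Maruyama discretization."""
    rng = rng or np.random.default_rng()
    x = np.zeros(len(ts))
    for i in range(1, len(ts)):
        dt = ts[i] - ts[i - 1]
        x[i] = x[i - 1] + theta * (mu - x[i - 1]) * dt + sigma * np.sqrt(dt) * rng.normal()
    return x

ts = np.linspace(0.0, 1.0, 256)
rng = np.random.default_rng(42)
series = sample_gp(ts, rng=rng) + sample_ou(ts, rng=rng)  # mix generator outputs
print(series.shape)  # (256,)
```

Pre-training on mixtures of such processes (plus augmentations) is one plausible way a synthetic-only model can see diverse trend, seasonality, and noise regimes without any real data.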