🤖 AI Summary
This paper addresses the unified challenge of modeling non-stationary univariate series, strongly coupled multivariate series, and exogenous covariates in multidimensional time series forecasting. The authors propose Timer-XL, a causal decoder-only Transformer foundation model designed for general-purpose time series prediction. The method introduces: (1) a multivariate next-token prediction paradigm that naturally extends univariate modeling to long-context, joint multivariate forecasting; (2) TimeAttention, a causally constrained attention mechanism paired with position encodings that enforce both temporal causality and variable equivalence; and (3) patch-based representation learning combined with large-scale pretraining to enable zero-shot generalization across tasks. Timer-XL achieves state-of-the-art performance on univariate, multivariate, and covariate-augmented forecasting benchmarks, and demonstrates strong zero-shot generalization as a pretrained time series foundation model.
📝 Abstract
We present Timer-XL, a causal Transformer for unified time series forecasting. To uniformly predict multidimensional time series, we generalize next token prediction, predominantly adopted for 1D token sequences, to multivariate next token prediction. This paradigm formulates various forecasting tasks as a long-context prediction problem. We opt for decoder-only Transformers that capture causal dependencies from varying-length contexts for unified forecasting, making predictions on non-stationary univariate time series, multivariate series with complicated dynamics and correlations, as well as covariate-informed contexts that include exogenous variables. Technically, we propose a universal TimeAttention to capture fine-grained intra- and inter-series dependencies of flattened time series tokens (patches), further enhanced by a carefully designed position embedding for temporal causality and variable equivalence. Timer-XL achieves state-of-the-art performance across task-specific forecasting benchmarks through a unified approach. Based on large-scale pre-training, Timer-XL also achieves state-of-the-art zero-shot performance, making it a promising architecture for pre-trained time series models. Code is available at https://github.com/thuml/Timer-XL.
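To make the attention structure concrete, here is a minimal sketch of the kind of mask TimeAttention implies when multivariate patches are flattened into one token sequence: a token for variable `v` at time patch `t` may attend to any variable (variable equivalence) at the same or an earlier time patch (temporal causality). The function name, token layout (time-major flattening), and NumPy formulation are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def time_attention_mask(num_vars: int, num_patches: int) -> np.ndarray:
    """Boolean attention mask over flattened (time patch, variable) tokens.

    Token index = t * num_vars + v. Entry [i, j] is True when token i is
    allowed to attend to token j. Illustrative sketch, not the paper's code.
    """
    # Lower-triangular matrix: each time patch sees itself and earlier patches.
    causal_over_time = np.tril(np.ones((num_patches, num_patches), dtype=bool))
    # All-ones block: every variable may attend to every variable.
    all_vars = np.ones((num_vars, num_vars), dtype=bool)
    # Kronecker product combines the two constraints over the flattened sequence.
    return np.kron(causal_over_time, all_vars)

mask = time_attention_mask(num_vars=2, num_patches=3)
# Token (t=0, v=0) can attend to (t=0, v=1) but not to any t=1 token.
```

In a standard decoder-only Transformer, a mask like this would be passed to the attention layer so that causality holds along time while all series remain mutually visible within each step of the context.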