Delayformer: spatiotemporal transformation for predicting high-dimensional dynamics

📅 2025-06-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing multivariate time-series forecasting for high-dimensional nonlinear dynamical systems under small-sample and noisy conditions, this paper proposes Delayformer. Methodologically, it pioneers the integration of delay embedding theory with vision Transformers, introducing a multivariate spatiotemporal information (mvSTI) transformation that maps delayed sequences of individual variables into joint state vectors—enabling system-level, rather than univariate, modeling to mitigate nonlinearity and inter-variable coupling. The architecture employs a shared ViT encoder, multi-head self-attention, and variable-specific linear decoders. Extensive experiments demonstrate that Delayformer significantly outperforms state-of-the-art methods on both synthetic and real-world benchmarks. Moreover, it exhibits strong cross-domain generalization across meteorological, traffic, and financial forecasting tasks, validating its potential as a foundational time-series modeling framework.

📝 Abstract
Predicting time series is of great importance in various scientific and engineering fields. However, in the context of limited and noisy data, accurately predicting the dynamics of all variables in a high-dimensional system is a challenging task due to their nonlinearity and complex interactions. Current methods, including deep learning approaches, often perform poorly on real-world systems under such circumstances. This study introduces the Delayformer framework for simultaneously predicting the dynamics of all variables, by developing a novel multivariate spatiotemporal information (mvSTI) transformation that turns each observed variable into a delay-embedded state (vector) and further cross-learns those states across different variables. From a dynamical-systems viewpoint, Delayformer predicts system states rather than individual variables, thus theoretically and computationally overcoming such nonlinearity and cross-interaction problems. Specifically, it first utilizes a single shared Vision Transformer (ViT) encoder to cross-represent dynamical states from observed variables in a delay-embedded form, and then employs distinct linear decoders for predicting the next states, i.e., equivalently predicting all original variables in parallel. By leveraging the theoretical foundations of delay embedding theory and the representational capabilities of Transformers, Delayformer outperforms current state-of-the-art methods in forecasting tasks on both synthetic and real-world datasets. Furthermore, the potential of Delayformer as a foundational time-series model is demonstrated through cross-domain forecasting tasks, highlighting its broad applicability across various scenarios.
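The delay embedding underlying the mvSTI transformation can be sketched in a few lines: each scalar series is unfolded into overlapping delay vectors (rows of a Hankel-style matrix), which serve as the system states the model predicts. A minimal numpy sketch, with `delay_embed` and the embedding length `L` as illustrative names not taken from the paper:

```python
import numpy as np

def delay_embed(x, L):
    """Map a scalar time series x[0..T-1] into delay vectors.

    Returns an array of shape (T - L + 1, L) whose row t is the
    delay-embedded state (x[t], x[t+1], ..., x[t+L-1]).
    """
    T = len(x)
    return np.stack([x[t:t + L] for t in range(T - L + 1)])

# Example: a 6-point series with embedding length L = 3
x = np.arange(6.0)          # [0, 1, 2, 3, 4, 5]
H = delay_embed(x, 3)
# H[0] = [0, 1, 2]; H[-1] = [3, 4, 5]; H.shape = (4, 3)
```

By Takens-style embedding arguments, such delay vectors can reconstruct the underlying state of the full system from a single observable, which is what lets Delayformer work at the system level rather than per variable.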
Problem

Research questions and friction points this paper is trying to address.

Predicting high-dimensional dynamics with limited noisy data
Overcoming nonlinearity and complex variable interactions
Improving accuracy in spatiotemporal forecasting tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multivariate spatiotemporal information transformation for dynamics
Shared ViT encoder for cross-representing delay-embedded states
Linear decoders for parallel prediction of all variables
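The encoder/decoder split listed above can be illustrated with a toy numpy sketch: one shared linear map stands in for the ViT encoder (self-attention omitted), and each variable gets its own linear decoder head so all next states are predicted in parallel. All shapes and names here are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

D, L, d = 3, 8, 4                           # variables, delay length, latent size
W_enc = rng.normal(size=(L, d))             # one encoder shared by all variables
W_dec = [rng.normal(size=(d, L)) for _ in range(D)]  # one decoder per variable

def predict_next_states(states):
    """states: (D, L) array, row i is variable i's current delay vector.

    Every variable is encoded with the same shared map, then decoded
    with its own linear head, yielding all D next states at once.
    """
    z = states @ W_enc                      # shared latent representation, (D, d)
    return np.stack([z[i] @ W_dec[i] for i in range(D)])

states = rng.normal(size=(D, L))
next_states = predict_next_states(states)   # shape (D, L)
```

Sharing the encoder is what allows cross-learning among variables, while the per-variable decoders keep the output mapping specific to each observable.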
Zijian Wang
Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
Peng Tao
Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
Luonan Chen
Chair Professor, School of Mathematical Sciences and School of AI, Shanghai Jiao Tong University
Systems Biology · Bioinformatics · Nonlinear Dynamics · AI