Delayformer: spatiotemporal transformation for predicting high-dimensional dynamics

📅 2025-06-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing multivariate time-series forecasting for high-dimensional nonlinear dynamical systems under small-sample and noisy conditions, this paper proposes Delayformer. Methodologically, it pioneers the integration of delay embedding theory with vision Transformers, introducing a multivariate spatiotemporal information (mvSTI) transformation that maps delayed sequences of individual variables into joint state vectors—enabling system-level, rather than univariate, modeling to mitigate nonlinearity and inter-variable coupling. The architecture employs a shared ViT encoder, multi-head self-attention, and variable-specific linear decoders. Extensive experiments demonstrate that Delayformer significantly outperforms state-of-the-art methods on both synthetic and real-world benchmarks. Moreover, it exhibits strong cross-domain generalization across meteorological, traffic, and financial forecasting tasks, validating its potential as a foundational time-series modeling framework.

📝 Abstract
Predicting time series is of great importance in various scientific and engineering fields. However, in the context of limited and noisy data, accurately predicting the dynamics of all variables in a high-dimensional system is a challenging task due to their nonlinearity and complex interactions. Current methods, including deep learning approaches, often perform poorly on real-world systems under such circumstances. This study introduces the Delayformer framework for simultaneously predicting the dynamics of all variables, by developing a novel multivariate spatiotemporal information (mvSTI) transformation that turns each observed variable into a delay-embedded state (vector) and further cross-learns those states across different variables. From a dynamical-systems viewpoint, Delayformer predicts system states rather than individual variables, thus theoretically and computationally overcoming such nonlinearity and cross-interaction problems. Specifically, it first utilizes a single shared Vision Transformer (ViT) encoder to cross-represent dynamical states from observed variables in a delay-embedded form, and then employs distinct linear decoders for predicting the next states, i.e., equivalently predicting all original variables in parallel. By leveraging the theoretical foundations of delay embedding theory and the representational capabilities of Transformers, Delayformer outperforms current state-of-the-art methods in forecasting tasks on both synthetic and real-world datasets. Furthermore, the potential of Delayformer as a foundational time-series model is demonstrated through cross-domain forecasting tasks, highlighting its broad applicability across various scenarios.
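The delay embedding underlying the mvSTI transformation can be sketched in a few lines: each scalar series is unfolded into overlapping delay vectors (rows of a Hankel-style matrix), which serve as the system states the model predicts. A minimal numpy sketch, with `delay_embed` and the embedding length `L` as illustrative names not taken from the paper:

```python
import numpy as np

def delay_embed(x, L):
    """Map a scalar time series x[0..T-1] into delay vectors.

    Returns an array of shape (T - L + 1, L) whose row t is the
    delay-embedded state (x[t], x[t+1], ..., x[t+L-1]).
    """
    T = len(x)
    return np.stack([x[t:t + L] for t in range(T - L + 1)])

# Example: a 6-point series with embedding length L = 3
x = np.arange(6.0)          # [0, 1, 2, 3, 4, 5]
H = delay_embed(x, 3)
# H[0] = [0, 1, 2]; H[-1] = [3, 4, 5]; H.shape = (4, 3)
```

By Takens-style embedding arguments, such delay vectors can reconstruct the underlying state of the full system from a single observable, which is what lets Delayformer work at the system level rather than per variable.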
Problem

Research questions and friction points this paper is trying to address.

Predicting high-dimensional dynamics with limited noisy data
Overcoming nonlinearity and complex variable interactions
Improving accuracy in spatiotemporal forecasting tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multivariate spatiotemporal information transformation for dynamics
Shared ViT encoder for cross-representing delay-embedded states
Linear decoders for parallel prediction of all variables
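The encoder/decoder split listed above can be illustrated with a toy numpy sketch: one shared linear map stands in for the ViT encoder (self-attention omitted), and each variable gets its own linear decoder head so all next states are predicted in parallel. All shapes and names here are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

D, L, d = 3, 8, 4                           # variables, delay length, latent size
W_enc = rng.normal(size=(L, d))             # one encoder shared by all variables
W_dec = [rng.normal(size=(d, L)) for _ in range(D)]  # one decoder per variable

def predict_next_states(states):
    """states: (D, L) array, row i is variable i's current delay vector.

    Every variable is encoded with the same shared map, then decoded
    with its own linear head, yielding all D next states at once.
    """
    z = states @ W_enc                      # shared latent representation, (D, d)
    return np.stack([z[i] @ W_dec[i] for i in range(D)])

states = rng.normal(size=(D, L))
next_states = predict_next_states(states)   # shape (D, L)
```

Sharing the encoder is what allows cross-learning among variables, while the per-variable decoders keep the output mapping specific to each observable.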
Zijian Wang
Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
Peng Tao
Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
Luonan Chen
Chair Professor, School of Mathematical Sciences and School of AI, Shanghai Jiao Tong University
Systems Biology · Bioinformatics · Nonlinear Dynamics · AI