🤖 AI Summary
To address the challenges of jointly modeling cross-variable and temporal dependencies, and of capturing dynamic causal time lags in high-dimensional multivariate time series forecasting, this paper proposes a global block-compression-guided cross-block attention mechanism. Operating within a compressed representation space, it enables efficient joint modeling of inter-variable and intra-temporal dependencies, reducing computational complexity to $O(D^2 \cdot \text{Patch\_num} \cdot d_{\text{model}})$. Built upon the Transformer architecture, the method integrates global patch compression, cross-patch self-attention, channel-aware temporal modeling, and lightweight parameterization. Extensive experiments on nine mainstream real-world datasets demonstrate consistent and significant improvements over state-of-the-art methods including iTransformer and PatchTST. The implementation is publicly available.
📝 Abstract
Among existing Transformer-based multivariate time series forecasting methods, iTransformer, which treats each variable's sequence as a token and explicitly extracts only cross-variable dependencies, and PatchTST, which adopts a channel-independent strategy and explicitly extracts only cross-time dependencies, both significantly outperform most channel-dependent Transformers that extract cross-time and cross-variable dependencies simultaneously. This indicates that existing Transformer-based multivariate time series forecasting methods still struggle to effectively fuse these two types of information. We attribute this issue to dynamic time lags in the causal relationships between different variables. We therefore propose a new multivariate time series forecasting Transformer, Sensorformer, which first compresses the global patch information and then simultaneously extracts cross-variable and cross-time dependencies from the compressed representations. Sensorformer can effectively capture the correct inter-variable correlations and causal relationships, even in the presence of dynamic causal lags between variables, while also reducing the computational complexity of pure cross-patch self-attention from $O(D^2 \cdot \text{Patch\_num}^2 \cdot d_{\text{model}})$ to $O(D^2 \cdot \text{Patch\_num} \cdot d_{\text{model}})$. Extensive comparative and ablation experiments on 9 mainstream real-world multivariate time series forecasting datasets demonstrate the superiority of Sensorformer. The implementation of Sensorformer, following the style of the Time-series-library, together with scripts for reproducing the main results, is publicly available at https://github.com/BigYellowTiger/Sensorformer
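As a rough illustration of the complexity reduction claimed above, the sketch below counts the dominant attention-score cost in both regimes. The function names and the example dimensions are hypothetical (not taken from the Sensorformer repository); the point is only that compressing each variable's patches into a global summary token removes one factor of Patch_num from the pairwise attention cost.

```python
# Illustrative FLOP-count sketch of the complexity claim (hypothetical
# helper names, not code from the paper's repository).

def pure_cross_patch_cost(D, patch_num, d_model):
    """Pure cross-patch self-attention: all D * patch_num tokens attend
    to each other, so the score matrix has (D * patch_num)^2 entries,
    each costing a d_model-dimensional dot product."""
    return (D * patch_num) ** 2 * d_model

def compressed_cross_patch_cost(D, patch_num, d_model):
    """After global patch compression, attention pairs scale with
    D^2 * patch_num rather than D^2 * patch_num^2."""
    return D ** 2 * patch_num * d_model

if __name__ == "__main__":
    # Example dimensions chosen only for illustration.
    D, patch_num, d_model = 21, 12, 128
    full = pure_cross_patch_cost(D, patch_num, d_model)
    compressed = compressed_cross_patch_cost(D, patch_num, d_model)
    # The cost ratio is exactly patch_num.
    print(full // compressed)  # -> 12
```

The ratio between the two costs is always Patch_num, which is why the saving grows with the input length (longer look-back windows produce more patches per variable).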