🤖 AI Summary
Existing Transformer-based models for multivariate time series forecasting typically capture either temporal or channel-wise dependencies in isolation, failing to jointly model both and thereby limiting predictive performance. To address this, the paper proposes Sentinel, a full Transformer architecture whose encoder extracts contextual information along the channel dimension, while its decoder captures causal relations and dependencies along the temporal dimension. Sentinel also introduces a multi-patch attention mechanism that leverages the patching process to structure the input sequence, replacing the standard multi-head splitting step. Evaluated on standard benchmarks, Sentinel achieves better or comparable performance with respect to state-of-the-art approaches in multivariate forecasting.
📝 Abstract
Transformer-based time series forecasting has recently gained strong interest due to the ability of transformers to model sequential data. Most state-of-the-art architectures exploit either temporal or inter-channel dependencies, limiting their effectiveness in multivariate time series forecasting, where both types of dependencies are crucial. We propose Sentinel, a full transformer-based architecture composed of an encoder able to extract contextual information from the channel dimension, and a decoder designed to capture causal relations and dependencies across the temporal dimension. Additionally, we introduce a multi-patch attention mechanism, which leverages the patching process to structure the input sequence in a way that can be naturally integrated into the transformer architecture, replacing the multi-head splitting process. Extensive experiments on standard benchmarks demonstrate that Sentinel, because of its ability to "monitor" both the temporal and the inter-channel dimension, achieves better or comparable performance with respect to state-of-the-art approaches.
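To make the multi-patch idea concrete, here is a minimal, hypothetical sketch (not the paper's implementation) of how patching can replace the multi-head split: the sequence is reshaped into patches, and each patch takes the role that a head plays in standard multi-head attention. All names, shapes, and the self-attention-per-patch choice are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_patch_attention(x, patch_len):
    """Illustrative multi-patch attention (assumed form, not the paper's code).

    x: (seq_len, d_model) input sequence for one channel.
    Instead of splitting d_model into heads, the sequence axis is split
    into patches, and scaled dot-product attention runs within each patch.
    """
    seq_len, d_model = x.shape
    assert seq_len % patch_len == 0, "sequence must divide evenly into patches"
    n_patches = seq_len // patch_len

    # Patches replace heads: (n_patches, patch_len, d_model).
    patches = x.reshape(n_patches, patch_len, d_model)

    # Per-patch scaled dot-product self-attention (projections omitted).
    scores = patches @ patches.transpose(0, 2, 1) / np.sqrt(d_model)
    out = softmax(scores, axis=-1) @ patches

    # Stitch patches back into a sequence of the original shape.
    return out.reshape(seq_len, d_model)

y = multi_patch_attention(np.random.randn(96, 16), patch_len=8)
```

In a real model, learned query/key/value projections and an output projection would wrap this, and the encoder would apply the analogous operation across the channel dimension rather than time.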