AI Summary
Existing multivariate time series forecasting methods suffer from modeling bias due to uniform weighting of short- and long-term dependencies. To address this, we propose ParallelTime, a novel architecture featuring an input-aware ParallelTime Weighter module that dynamically allocates adaptive weights to local attention (for short-term patterns) and Mamba-based state-space modeling (for long-term dynamics), thereby overcoming the limitations of conventional equal-weighted fusion. The architecture adopts a parallel design to jointly optimize computational efficiency and modeling flexibility. Extensive experiments on multiple benchmark datasets demonstrate state-of-the-art (SOTA) performance, with significant reductions in both FLOPs and parameter count. Moreover, ParallelTime exhibits superior long-horizon extrapolation capability, confirming its effectiveness in capturing extended temporal dependencies.
Abstract
Modern multivariate time series forecasting primarily relies on two architectures: the Transformer with its attention mechanism, and Mamba. In natural language processing, one existing approach combines local window attention for capturing short-term dependencies with Mamba for capturing long-term dependencies, and averages their outputs so that both receive equal weight. We find that for time-series forecasting tasks, assigning equal weight to long-term and short-term dependencies is not optimal. To mitigate this, we propose a dynamic weighting mechanism, ParallelTime Weighter, which calculates interdependent weights for long-term and short-term dependencies for each token based on the input and the model's knowledge. Furthermore, we introduce the ParallelTime architecture, which incorporates the ParallelTime Weighter mechanism to deliver state-of-the-art performance across diverse benchmarks. Our architecture demonstrates robustness, achieves lower FLOPs, requires fewer parameters, scales effectively to longer prediction horizons, and significantly outperforms existing methods. These advances highlight a promising path for future developments of parallel Attention-Mamba in time series forecasting. The implementation is available at https://github.com/itay1551/ParallelTime (ParallelTime GitHub).
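The difference between equal-weight averaging and per-token dynamic weighting can be illustrated with a minimal sketch. This is not the ParallelTime Weighter itself; the abstract does not specify its internals. The sketch assumes that the weights come from a learned linear projection of the two concatenated branch outputs, followed by a softmax over the two branches, so that the short-term and long-term weights are interdependent and sum to 1. The names `weighted_fusion`, `average_fusion`, `W`, and `b` are hypothetical, and plain Python lists stand in for tensors.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_fusion(attn_tok, mamba_tok):
    """Baseline: equal weight (0.5 each) for both branch outputs of one token."""
    return [0.5 * (a + m) for a, m in zip(attn_tok, mamba_tok)]


def weighted_fusion(attn_tok, mamba_tok, W, b):
    """Hypothetical dynamic weighting for one token.

    attn_tok, mamba_tok: branch outputs for the token, each of length d_model.
    W: two weight vectors of length 2 * d_model (assumed learned parameters).
    b: two biases (assumed learned parameters).
    """
    features = attn_tok + mamba_tok  # list concatenation, length 2 * d_model
    logits = [
        sum(f * w for f, w in zip(features, row)) + bias
        for row, bias in zip(W, b)
    ]
    # Softmax couples the two weights: raising one lowers the other.
    w_short, w_long = softmax(logits)
    fused = [w_short * a + w_long * m for a, m in zip(attn_tok, mamba_tok)]
    return fused, (w_short, w_long)


# Toy example with random values standing in for real branch outputs.
random.seed(0)
d_model = 4
attn_tok = [random.gauss(0, 1) for _ in range(d_model)]   # local attention output
mamba_tok = [random.gauss(0, 1) for _ in range(d_model)]  # Mamba output
W = [[random.gauss(0, 0.1) for _ in range(2 * d_model)] for _ in range(2)]
b = [0.0, 0.0]

fused, (w_short, w_long) = weighted_fusion(attn_tok, mamba_tok, W, b)
print("short-term weight:", w_short, "long-term weight:", w_long)
```

Under these assumptions, setting `W` and `b` to zero gives each branch a weight of 0.5 and recovers the equal-weight averaging baseline, so the baseline is a special case of the dynamic scheme.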