🤖 AI Summary
Existing time-series models face insufficient modeling of nonlinear dependencies in attention and restricted receptive fields caused by convolutional operations. To address these limitations, this paper proposes Attention Mamba, a framework that integrates state-space models with attention mechanisms. It introduces an adaptive pooling attention block that accelerates attention computation while injecting global contextual information, mitigating the limited receptive field; it also designs a bidirectional Mamba block that captures long- and short-range features and transforms the input into the Value representations for attention. By fusing state-space dynamics with attention-based global interaction, the framework strengthens long-range dependency capture and nonlinear modeling capacity. Extensive experiments across diverse time-series forecasting benchmarks show that Attention Mamba outperforms leading baselines, supporting its effectiveness, robustness, and generalizability.
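A minimal sketch of what such an adaptive-pooling attention with a bidirectional value path might look like in PyTorch. This is an illustration inferred from the description above, not the authors' implementation: the class name `AdaptivePoolingAttention`, the `pooled_len` parameter, and the bidirectional GRU standing in for the paper's bidirectional Mamba block are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptivePoolingAttention(nn.Module):
    """Hypothetical sketch: keys and values are computed on an adaptively
    pooled (shorter) sequence, which reduces attention cost while still
    carrying global context; the value path uses a bidirectional encoder
    as a stand-in for the paper's bidirectional Mamba block."""

    def __init__(self, d_model: int, n_heads: int = 4, pooled_len: int = 16):
        super().__init__()
        self.pooled_len = pooled_len
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        # Bidirectional GRU as a placeholder for the bidirectional Mamba
        # block that maps the input to the Value representations.
        self.value_encoder = nn.GRU(d_model, d_model // 2,
                                    batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        # Adaptive average pooling along time compresses the sequence to a
        # fixed length, so attention costs O(L * pooled_len), not O(L^2).
        pooled = F.adaptive_avg_pool1d(x.transpose(1, 2),
                                       self.pooled_len).transpose(1, 2)
        q = self.q_proj(x)                 # full-resolution queries
        k = self.k_proj(pooled)            # pooled keys
        v, _ = self.value_encoder(pooled)  # bidirectional value path
        out, _ = self.attn(q, k, v)
        return out

# Usage: a batch of 8 series, 96 time steps, 64 channels.
x = torch.randn(8, 96, 64)
print(AdaptivePoolingAttention(64)(x).shape)  # torch.Size([8, 96, 64])
```

Compressing keys and values to a fixed `pooled_len` is one plausible way the Adaptive Pooling block could both speed up attention and summarize global information; the actual pooling scheme and value construction in the paper may differ.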
📝 Abstract
"This work has been submitted to the lEEE for possible publication. Copyright may be transferred without noticeafter which this version may no longer be accessible."Time series modeling serves as the cornerstone of real-world applications, such as weather forecasting and transportation management. Recently, Mamba has become a promising model that combines near-linear computational complexity with high prediction accuracy in time series modeling, while facing challenges such as insufficient modeling of nonlinear dependencies in attention and restricted receptive fields caused by convolutions. To overcome these limitations, this paper introduces an innovative framework, Attention Mamba, featuring a novel Adaptive Pooling block that accelerates attention computation and incorporates global information, effectively overcoming the constraints of limited receptive fields. Furthermore, Attention Mamba integrates a bidirectional Mamba block, efficiently capturing long-short features and transforming inputs into the Value representations for attention mechanisms. Extensive experiments conducted on diverse datasets underscore the effectiveness of Attention Mamba in extracting nonlinear dependencies and enhancing receptive fields, establishing superior performance among leading counterparts. Our codes will be available on GitHub.