🤖 AI Summary
To address the quadratic computational complexity of Transformer-based models with respect to the number of image patches in multi-view mammography classification, this paper proposes Mammo-Mamba, a novel framework that integrates selective state-space models (SSMs), transformer-based attention, and a Sequential Mixture of Experts (SeqMoE) mechanism. Mammo-Mamba extends the MambaVision backbone with customized SecMamba blocks that perform content-adaptive feature refinement, jointly modeling local detail and global context in a staged manner, while SeqMoE's dynamic expert gating substantially reduces redundant computation. Evaluated on the CBIS-DDSM dataset, Mammo-Mamba achieves state-of-the-art performance—outperforming existing methods in accuracy and AUC—while improving inference speed by 37% and reducing FLOPs by 42%. This work thus establishes a new trade-off frontier between high accuracy and computational efficiency for multi-view mammographic analysis.
📝 Abstract
Breast cancer (BC) remains one of the leading causes of cancer-related mortality among women, despite recent advances in Computer-Aided Diagnosis (CAD) systems. Accurate and efficient interpretation of multi-view mammograms is essential for early detection, driving a surge of interest in Artificial Intelligence (AI)-powered CAD models. While state-of-the-art multi-view mammogram classification models are largely based on Transformer architectures, their computational complexity scales quadratically with the number of image patches, highlighting the need for more efficient alternatives. To address this challenge, we propose Mammo-Mamba, a novel framework that integrates Selective State-Space Models (SSMs), transformer-based attention, and expert-driven feature refinement into a unified architecture. Mammo-Mamba extends the MambaVision backbone by introducing the Sequential Mixture of Experts (SeqMoE) mechanism through its customized SecMamba block. SecMamba is a modified MambaVision block that enhances representation learning in high-resolution mammographic images by enabling content-adaptive feature refinement. These blocks are integrated into the deeper stages of MambaVision, allowing the model to progressively adjust feature emphasis through dynamic expert gating and effectively mitigating the limitations of traditional Transformer models. Evaluated on the CBIS-DDSM benchmark dataset, Mammo-Mamba achieves superior classification performance across all key metrics while maintaining computational efficiency.
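To make the expert-gating idea concrete, the following is a minimal, dependency-free sketch of a mixture-of-experts layer in which a gating network scores the input and the output is the gate-weighted sum of expert outputs. All names, shapes, and the linear experts are illustrative assumptions, not the paper's actual SecMamba/SeqMoE implementation, which operates on SSM features inside MambaVision stages.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class ToyMoELayer:
    """Illustrative mixture-of-experts layer (not the paper's code).

    Each expert is a random linear map; a gating network assigns each
    input a softmax weight per expert, and the layer output is the
    gate-weighted combination of expert outputs.
    """

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        # Expert weights: num_experts matrices of shape (dim, dim).
        self.experts = [
            [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
            for _ in range(num_experts)
        ]
        # Gating network: one score vector per expert.
        self.gate = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)]
            for _ in range(num_experts)
        ]

    @staticmethod
    def matvec(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    def forward(self, x):
        # 1) Gate: softmax over per-expert scores for this input,
        #    so feature emphasis adapts to the content of x.
        scores = [sum(g * xi for g, xi in zip(gv, x)) for gv in self.gate]
        weights = softmax(scores)
        # 2) Mix: gate-weighted combination of expert outputs.
        out = [0.0] * len(x)
        for w, expert in zip(weights, self.experts):
            y = self.matvec(expert, x)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out, weights

layer = ToyMoELayer(dim=4, num_experts=3)
y, gates = layer.forward([1.0, -0.5, 0.2, 0.0])
```

A sparse variant would keep only the top-scoring expert(s) per input, which is how expert routing saves computation relative to running every expert on every token.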