🤖 AI Summary
Pathological image cell detection faces challenges including high cell density, subtle inter-class distinctions, and severe background interference. To address these, we propose a lightweight, efficient one-stage fine-grained detector. Our method introduces the novel Triple-Mapping Adaptive Coupling (TMAC) module, which adaptively fuses local sensitivity with global consistency; an Adaptive Mamba Head for dynamic multi-scale feature weighting; and a VSSD backbone integrating NC-Mamba/MSA with TMAC, enhanced by a learnable multi-scale feature fusion mechanism. Evaluated on CoNSeP and CytoDArk0, our approach surpasses CNN-, Transformer-, and Mamba-based baselines in detection accuracy while reducing model parameters by 32% and inference latency by 41%.
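The summary mentions that the Adaptive Mamba Head weights multi-scale features with learnable, dynamic coefficients. The paper's exact design is not given here; the following is a minimal PyTorch sketch of one common way such learnable multi-scale fusion can be realized (the class name, the softmax normalization, and fusing at the finest resolution are all our assumptions, not the authors' implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusionSketch(nn.Module):
    """Hypothetical sketch of learnable multi-scale feature fusion.

    Feature maps from several pyramid levels are resized to the finest
    resolution and combined with softmax-normalized learnable weights,
    so the network can decide how much each scale contributes.
    """

    def __init__(self, num_levels: int):
        super().__init__()
        # One learnable logit per pyramid level; softmax keeps the
        # fusion weights positive and summing to one.
        self.level_logits = nn.Parameter(torch.zeros(num_levels))

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        target = feats[0].shape[-2:]  # fuse at the finest spatial scale
        w = torch.softmax(self.level_logits, dim=0)
        return sum(
            w[i] * F.interpolate(f, size=target, mode="bilinear",
                                 align_corners=False)
            for i, f in enumerate(feats)
        )
```

With three pyramid levels, the module returns a single map at the resolution of the first (finest) input, which a detection head could then consume.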
📝 Abstract
Cell detection in pathological images presents unique challenges due to densely packed objects, subtle inter-class differences, and severe background clutter. In this paper, we propose CellMamba, a lightweight and accurate one-stage detector tailored for fine-grained biomedical instance detection. Built upon a VSSD backbone, CellMamba integrates CellMamba Blocks, which couple either NC-Mamba or Multi-Head Self-Attention (MSA) with a novel Triple-Mapping Adaptive Coupling (TMAC) module. TMAC enhances spatial discriminability by splitting channels into two parallel branches, each equipped with its own idiosyncratic attention map alongside a shared consensus map; the three maps are adaptively fused to preserve both local sensitivity and global consistency. Furthermore, we design an Adaptive Mamba Head that fuses multi-scale features via learnable weights for robust detection across varying object sizes. Extensive experiments on two public datasets, CoNSeP and CytoDArk0, demonstrate that CellMamba outperforms CNN-based, Transformer-based, and Mamba-based baselines in accuracy while significantly reducing model size and inference latency. Our results validate CellMamba as an efficient and effective solution for high-resolution cell detection.
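The abstract describes TMAC's mechanics concretely: a channel split into two branches, one idiosyncratic attention map per branch, a shared consensus map, and adaptive fusion of the three. A minimal PyTorch sketch of that structure follows; every layer choice (1x1 convolutions for the maps, sigmoid gating, per-branch softmax fusion weights) is our illustrative assumption, not the paper's actual implementation:

```python
import torch
import torch.nn as nn

class TMACSketch(nn.Module):
    """Hypothetical sketch of Triple-Mapping Adaptive Coupling (TMAC).

    Channels are split into two parallel branches. Each branch computes
    its own ("idiosyncratic") spatial attention map for local sensitivity;
    a third ("consensus") map computed from all channels is shared by both
    branches for global consistency. Per branch, the two maps are mixed
    with learnable softmax weights before gating the features.
    """

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0, "sketch assumes an even channel split"
        half = channels // 2
        # One 1x1 conv per branch producing a per-pixel attention logit.
        self.idio_a = nn.Conv2d(half, 1, kernel_size=1)
        self.idio_b = nn.Conv2d(half, 1, kernel_size=1)
        # Consensus map computed from the full channel set.
        self.consensus = nn.Conv2d(channels, 1, kernel_size=1)
        # Learnable fusion logits: row per branch, columns weight the
        # idiosyncratic vs. consensus map (softmax-normalized in forward).
        self.fuse = nn.Parameter(torch.zeros(2, 2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xa, xb = torch.chunk(x, 2, dim=1)        # channel split
        con = torch.sigmoid(self.consensus(x))   # global consistency
        ia = torch.sigmoid(self.idio_a(xa))      # local map, branch A
        ib = torch.sigmoid(self.idio_b(xb))      # local map, branch B
        wa = torch.softmax(self.fuse[0], dim=0)  # adaptive coupling weights
        wb = torch.softmax(self.fuse[1], dim=0)
        ya = xa * (wa[0] * ia + wa[1] * con)
        yb = xb * (wb[0] * ib + wb[1] * con)
        return torch.cat([ya, yb], dim=1)        # same shape as input
```

Because the output keeps the input's shape, such a module can sit inside a CellMamba Block alongside the NC-Mamba or MSA path without changing the surrounding tensor plumbing.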