🤖 AI Summary
Weakly supervised lesion segmentation in medical imaging is hindered by the scarcity of pixel-level annotations. To address this, we propose a two-stage, frequency-domain-enhanced Mamba framework that trains from image-level labels alone. In Stage I, a learnable frequency-domain encoder works alongside a Mamba-based encoder under the multiple instance learning (MIL) paradigm to generate class activation maps (CAMs), bringing spectral modeling into MIL. In Stage II, soft pseudo-labels derived from the CAMs supervise the segmentation model, and a self-correction mechanism suppresses label noise. Embedding frequency-domain priors improves spatial and structural awareness without explicit pixel-wise supervision. On multiple public and private histopathology datasets, the method outperforms existing state-of-the-art weakly supervised methods in segmentation accuracy and robustness, supporting frequency-driven sequence modeling as a basis for weakly supervised medical image analysis.
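To make the Stage I idea of a learnable frequency-domain encoder more concrete, here is a minimal sketch of spectral filtering over a one-dimensional sequence of patch features. It is an illustration only: the summary does not specify the module's design, so the function names (`frequency_filter`, `fuse`), the per-bin weight vector standing in for learnable parameters, and the additive fusion with the spatial branch are all assumptions rather than the authors' implementation.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def frequency_filter(features, weights):
    """Scale each frequency bin of a patch-feature sequence by a weight.

    `weights` is a hypothetical stand-in for learnable parameters; in a
    trained model these values would be updated by gradient descent.
    """
    spectrum = dft(features)
    filtered = [w * c for w, c in zip(weights, spectrum)]
    # Keep the real part; it is exact when the weights are real and symmetric.
    return [v.real for v in idft(filtered)]


def fuse(spatial, spectral):
    """Add the spectral branch to the spatial features (assumed fusion rule)."""
    return [s + f for s, f in zip(spatial, spectral)]
```

With all weights set to 1 the filter leaves the features unchanged, and with only the zero-frequency weight kept it replaces every patch feature with the sequence mean, which is one way to see how the weights control which spatial scales reach the downstream encoder.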
📝 Abstract
Accurate lesion segmentation in histopathology images is essential for diagnostic interpretation and quantitative analysis, yet it remains challenging due to the limited availability of costly pixel-level annotations. To address this, we propose FMaMIL, a novel two-stage framework for weakly supervised lesion segmentation based solely on image-level labels. In the first stage, a lightweight Mamba-based encoder is introduced to capture long-range dependencies across image patches under the multiple instance learning (MIL) paradigm. To enhance spatial sensitivity and structural awareness, we design a learnable frequency-domain encoding module that supplements spatial-domain features with spectrum-based information. The class activation maps (CAMs) generated in this stage are used to guide segmentation training. In the second stage, we refine the initial pseudo labels via CAM-guided soft-label supervision and a self-correction mechanism, enabling robust training even under label noise. Extensive experiments on both public and private histopathology datasets demonstrate that FMaMIL outperforms state-of-the-art weakly supervised methods without relying on pixel-level annotations, validating its effectiveness and potential for digital pathology applications.
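The second stage can be illustrated with a small sketch of soft-label supervision and label correction. The abstract does not give the loss or the correction rule, so everything here is an assumption made for illustration: min-max normalization of CAM scores into soft labels, binary cross-entropy against those soft targets, and an exponential-moving-average style blend of labels with model predictions as the self-correction step. The names `cam_to_soft_labels`, `soft_bce`, `self_correct`, and the `momentum` parameter are hypothetical.

```python
import math


def cam_to_soft_labels(cam):
    """Min-max normalize raw CAM scores into soft foreground labels in [0, 1]."""
    lo, hi = min(cam), max(cam)
    if hi == lo:
        # A flat CAM carries no localization signal; treat all pixels as background.
        return [0.0 for _ in cam]
    return [(v - lo) / (hi - lo) for v in cam]


def soft_bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predictions and soft targets."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def self_correct(soft_labels, pred, momentum=0.9):
    """Blend current soft labels with model predictions (assumed update rule)."""
    return [momentum * s + (1.0 - momentum) * p for s, p in zip(soft_labels, pred)]
```

In a training loop under these assumptions, the soft labels would be initialized once from the Stage I CAMs, used as targets for `soft_bce`, and periodically updated with `self_correct` so that confident, consistent predictions gradually override noisy CAM regions.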