AI Summary
Vessel segmentation in medical images often suffers from structural discontinuities because vessels are slender and tortuous and prior modeling is insufficient. To address this, we propose SWinMamba, a novel architecture that incorporates serpentine window sequences into bidirectional state space models. Specifically, we design a Serpentine Window Tokenizer (SWToken) for adaptive feature sampling and flexible receptive field control; introduce a Bidirectional Aggregation Module (BAM) to enhance local feature fusion; and construct a Spatial-Frequency Fusion Unit (SFFU) to improve continuity representation. The serpentine window sequences guide global context modeling toward the vessel structure, which strengthens connectivity for thin, elongated vessels. Evaluated on three mainstream medical vessel datasets, SWinMamba achieves new state-of-the-art performance: an 18.3% reduction in Hausdorff distance (improved completeness), a 4.2% gain in F1-score (enhanced connectivity), and superior overall segmentation accuracy.
Abstract
Vascular segmentation in medical images is crucial for disease diagnosis and surgical navigation. However, the segmented vascular structure is often discontinuous because vessels are slender and prior modeling is inadequate. In this paper, we propose a novel Serpentine Window Mamba (SWinMamba) to achieve accurate vascular segmentation. The proposed SWinMamba models the continuity of slender vascular structures by incorporating serpentine window sequences into bidirectional state space models. The serpentine window sequences enable efficient feature capture by adaptively guiding global visual context modeling toward the vascular structure. Specifically, the Serpentine Window Tokenizer (SWToken) adaptively splits the input image using overlapping serpentine window sequences, enabling flexible receptive fields (RFs) for vascular structure modeling. The Bidirectional Aggregation Module (BAM) integrates coherent local features in the RFs for vascular continuity representation. In addition, dual-domain learning with a Spatial-Frequency Fusion Unit (SFFU) is designed to enhance the feature representation of the vascular structure. Extensive experiments on three challenging datasets demonstrate that the proposed SWinMamba achieves superior performance with complete and connected vessels.
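The abstract does not specify how the serpentine window sequences are constructed, so the following is only a simplified sketch of the general idea, not the authors' SWToken. It assumes a fixed boustrophedon (snake) scan over a feature grid rather than the adaptive, overlapping windows described above, and the function names `serpentine_indices` and `bidirectional_sequences` are hypothetical. It shows one property that makes such an ordering attractive for sequence models: consecutive tokens stay spatially adjacent, and reversing the sequence gives the second direction for a bidirectional scan.

```python
import numpy as np


def serpentine_indices(height, width):
    """Flat indices of a (height, width) grid visited in snake order.

    Assumption: a fixed boustrophedon scan, used here only as a stand-in
    for the adaptive serpentine windows in the paper.
    """
    idx = np.arange(height * width).reshape(height, width)
    # Reverse every odd row so the scan turns around at each row end.
    idx[1::2] = idx[1::2, ::-1].copy()
    return idx.reshape(-1)


def bidirectional_sequences(feature_map):
    """Turn an (H, W, C) feature map into forward and backward token sequences.

    Both outputs have shape (H * W, C); the backward sequence is the
    forward one reversed, as a bidirectional sequence model would consume it.
    """
    h, w, c = feature_map.shape
    order = serpentine_indices(h, w)
    tokens = feature_map.reshape(h * w, c)[order]
    return tokens, tokens[::-1]


if __name__ == "__main__":
    print(serpentine_indices(3, 4).tolist())
    # [0, 1, 2, 3, 7, 6, 5, 4, 8, 9, 10, 11]
```

In a plain row-major flattening, the last token of one row and the first token of the next are far apart in the image; the snake order removes that jump, which is one reason such scans are considered for thin, continuous structures.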