🤖 AI Summary
Skin lesion segmentation faces two challenges: the limited receptive fields of CNNs and the high computational overhead of Transformers. To address these, we propose ASP-VMUNet (Atrous Shifted Parallel Vision Mamba UNet), a lightweight, efficient U-Net architecture integrating three key components: (1) atrous (dilated) scanning, which gives sparse yet extensive spatial coverage while reducing background interference, (2) Parallel Vision Mamba (PVM) layers for long-range dependency modeling, and (3) a CNN branch with a Selective-Kernel (SK) block for adaptive multi-scale feature extraction. A shift round operation further enhances information exchange between feature segments, supporting multi-scale contextual aggregation. ASP-VMUNet achieves state-of-the-art performance on four major benchmarks (ISIC2016, ISIC2017, ISIC2018, and PH2) while maintaining a low parameter count (<15M) and high inference efficiency (>35 FPS on RTX 3090). Ablation studies confirm the efficacy of each component. The source code is publicly available.
📝 Abstract
Skin lesion segmentation is a critical challenge in computer vision: accurately separating pathological features from healthy skin is essential for diagnosis. Traditional Convolutional Neural Networks (CNNs) are limited by narrow receptive fields, while Transformers carry a significant computational burden. This paper presents a novel skin lesion segmentation framework, the Atrous Shifted Parallel Vision Mamba UNet (ASP-VMUNet), which integrates the efficient and scalable Mamba architecture to overcome both limitations. The framework introduces an atrous scan technique that minimizes background interference and expands the receptive field, enhancing Mamba's scanning capabilities. In addition, a Parallel Vision Mamba (PVM) layer and a shift round operation improve how features are split into segments and foster rich information exchange between those segments. A supplementary CNN branch with a Selective-Kernel (SK) Block further refines the segmentation by blending local and global contextual information. Tested on four benchmark datasets (ISIC16/17/18 and PH2), ASP-VMUNet demonstrates superior performance in skin lesion segmentation, validated by comprehensive ablation studies. This approach not only advances medical image segmentation but also highlights the benefits of hybrid architectures in medical imaging. Our code is available at https://github.com/BaoBao0926/ASP-VMUNet/tree/main.
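The abstract does not spell out how the atrous scan or the shift round operation work, so the following is only a minimal sketch of one plausible reading, not the ASP-VMUNet implementation. It assumes that an atrous scan visits every `rate`-th pixel of a row-major grid from each possible offset, and that shift round circularly rotates a list of channel groups. The function names `atrous_scan_order` and `shift_round` and the parameters `rate` and `step` are hypothetical.

```python
# Illustrative sketch only; names and behavior are assumptions,
# not taken from the ASP-VMUNet code base.

def atrous_scan_order(height, width, rate=2):
    """Return rate * rate index sequences over a row-major H x W grid.

    Each sequence visits every `rate`-th pixel in both directions,
    starting from a different (row, col) offset. Together the
    sequences cover every pixel exactly once.
    """
    sequences = []
    for row_off in range(rate):
        for col_off in range(rate):
            seq = [r * width + c
                   for r in range(row_off, height, rate)
                   for c in range(col_off, width, rate)]
            sequences.append(seq)
    return sequences


def shift_round(groups, step=1):
    """Circularly rotate a list of channel groups by `step` positions."""
    if not groups:
        return []
    step %= len(groups)
    return groups[-step:] + groups[:-step]


if __name__ == "__main__":
    # On a 4 x 4 grid with rate 2, each sequence spans the whole image
    # while touching only a quarter of the pixels.
    for seq in atrous_scan_order(4, 4, rate=2):
        print(seq)
    # Rotating the groups means a given branch sees a different
    # channel segment in the next layer.
    print(shift_round(["g0", "g1", "g2", "g3"]))
```

Under these assumptions, each scanned sequence is `rate * rate` times shorter than a dense scan yet still spans the full image, which is one way to read the claim of an expanded receptive field. The rotation likewise gives a simple mechanism for exchanging information between parallel branches.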