ASP-VMUNet: Atrous Shifted Parallel Vision Mamba U-Net for Skin Lesion Segmentation

📅 2025-03-25
🤖 AI Summary
Skin lesion segmentation faces two challenges: limited receptive fields in CNNs and high computational overhead in Transformers. To address these, we propose ASP-VMUNet, a lightweight, efficient U-Net architecture that integrates three components: (1) an atrous scan that gives sparse yet extensive spatial coverage, (2) Parallel Vision Mamba (PVM) modules for long-range dependency modeling, and (3) a CNN branch with a Selective-Kernel (SK) Block for adaptive multi-scale feature extraction. A shift round operation further enhances information exchange between feature segments. ASP-VMUNet achieves state-of-the-art performance on four benchmarks (ISIC2016, ISIC2017, ISIC2018, and PH2) while maintaining a low parameter count (<15M) and high inference efficiency (>35 FPS on RTX 3090). Ablation studies confirm the contribution of each component. The source code is publicly available.

📝 Abstract
Skin lesion segmentation is a critical challenge in computer vision: accurate diagnosis depends on separating pathological features from healthy skin. Traditional Convolutional Neural Networks (CNNs) are limited by narrow receptive fields, and Transformers carry a significant computational burden. This paper presents a novel skin lesion segmentation framework, the Atrous Shifted Parallel Vision Mamba UNet (ASP-VMUNet), which integrates the efficient and scalable Mamba architecture to overcome both limitations. The framework introduces an atrous scan technique that minimizes background interference and expands the receptive field, enhancing Mamba's scanning capabilities. In addition, a Parallel Vision Mamba (PVM) layer and a shift round operation optimize feature segmentation and foster rich inter-segment information exchange. A supplementary CNN branch with a Selective-Kernel (SK) Block further refines the segmentation by blending local and global contextual information. Tested on four benchmark datasets (ISIC16/17/18 and PH2), ASP-VMUNet demonstrates superior performance in skin lesion segmentation, validated by comprehensive ablation studies. This approach not only advances medical image segmentation but also highlights the benefits of hybrid architectures in medical imaging. Our code is available at https://github.com/BaoBao0926/ASP-VMUNet/tree/main.
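The abstract does not spell out how the atrous scan orders tokens, so the following is only a minimal sketch of one plausible reading: by analogy with atrous (dilated) convolution, the feature grid is sampled every `rate` cells, producing interleaved sub-grids that are each flattened into a short row-major sequence. The function names, the `rate` and `offset` parameters, and the row-major order are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical illustration of a dilated ("atrous") scan over a 2-D grid.
# Nothing here is taken from the ASP-VMUNet code base.

def atrous_scan(grid, rate=2, offset=(0, 0)):
    """Row-major sequence of the cells sampled every `rate` steps from `offset`."""
    oy, ox = offset
    return [grid[y][x]
            for y in range(oy, len(grid), rate)
            for x in range(ox, len(grid[0]), rate)]

def atrous_partitions(grid, rate=2):
    """All rate*rate interleaved sub-sequences; together they cover every cell once."""
    return {(oy, ox): atrous_scan(grid, rate, (oy, ox))
            for oy in range(rate)
            for ox in range(rate)}

# A 4x4 grid whose cells hold their own flat index (0..15).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
seqs = atrous_partitions(grid, rate=2)
for start, seq in seqs.items():
    print(start, seq)
```

Under this reading, each sequence is a quarter of the original length yet spans the whole image, which is one way a scan could widen its effective receptive field while skipping neighbouring cells.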
Problem

Research questions and friction points this paper is trying to address.

Overcoming CNN and Transformer limitations in skin lesion segmentation
Enhancing Mamba's scanning with atrous technique for better segmentation
Improving segmentation accuracy via hybrid architecture and SK Block
Innovation

Methods, ideas, or system contributions that make the work stand out.

Atrous scan technique minimizes background interference
Parallel Vision Mamba layer optimizes feature segmentation
CNN branch with SK Block blends contextual information
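The page gives no detail on how the PVM layer divides features or what the shift round operation does. As a rough sketch under stated assumptions, the code below treats the parallel layer as splitting channels into equal groups that are processed independently, and treats the shift round as a cyclic rotation of the channel order before the split, so that channels near group boundaries meet different partners on different layers. All names (`split_channels`, `shift_round`, `parallel_layer`) and the stand-in `fn` are hypothetical.

```python
# Hypothetical illustration of a channel split with a cyclic shift.
# Nothing here is taken from the ASP-VMUNet code base.

def split_channels(channels, groups):
    """Split a channel list into `groups` equal, contiguous segments."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = len(channels) // groups
    return [channels[i * size:(i + 1) * size] for i in range(groups)]

def shift_round(channels, shift):
    """Cyclically rotate the channel order by `shift` positions."""
    if not channels:
        return []
    shift %= len(channels)
    return channels[shift:] + channels[:shift]

def parallel_layer(channels, groups, fn, shift=0):
    """Rotate, split, apply `fn` to each segment independently, then concatenate."""
    rotated = shift_round(channels, shift)
    outputs = [fn(segment) for segment in split_channels(rotated, groups)]
    return [value for segment in outputs for value in segment]

channels = list(range(8))
print(split_channels(channels, 4))                  # grouping without a shift
print(split_channels(shift_round(channels, 1), 4))  # grouping after a shift of 1

# Stand-in for a per-segment module: replace each value with the segment sum.
mixed = parallel_layer(channels, 4, lambda seg: [sum(seg)] * len(seg), shift=1)
print(mixed)
```

In a real network `fn` would be a Mamba block applied to a tensor slice; the list-based version only shows how the grouping changes when the shift changes.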
Muyi Bao
Carnegie Mellon University
Shuchang Lyu
School of Electronics and Information Engineering, Beihang University
Zhaoyang Xu
Department of Paediatrics, Cambridge University
Qi Zhao
School of Electronics and Information Engineering, Beihang University
Changyu Zeng
XJTLU
self-supervised learning, point cloud, computer vision
Wenpei Bai
Department of Gynecology and Obstetrics, Beijing Shijitan Hospital, Capital Medical University
Guangliang Cheng
Reader (Associate Professor) at the University of Liverpool
Computer Vision, Deepfake Detection, Autonomous Driving, Robotics