🤖 AI Summary
Skin lesion segmentation faces challenges including complex anatomical structures, ambiguous boundaries, and large inter-lesion scale variations. To address these, this paper proposes an enhanced dual-path TransUNet architecture that synergistically integrates Transformer and CNN modules: the former captures long-range semantic dependencies, while the latter preserves fine-grained local texture details. A boundary-guided attention mechanism is introduced to strengthen edge localization, and a multi-scale upsampling pathway is designed to improve structural consistency in lesion reconstruction. The framework further incorporates multi-resolution input, lesion-aware data augmentation, and systematic hyperparameter optimization. Evaluated on public benchmark datasets, the proposed method achieves state-of-the-art performance, surpassing existing approaches in mean Intersection-over-Union (mIoU), mean Dice coefficient (mDice), and mean pixel accuracy (mAcc), and demonstrating superior segmentation precision and robustness. This advancement provides a more reliable foundation for intelligent skin cancer diagnosis.
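The three reported metrics are standard overlap measures over binary lesion masks. The paper does not publish its evaluation code, so the following is a minimal stdlib-only sketch of per-image IoU, Dice, and pixel accuracy; the mean variants (mIoU, mDice, mAcc) simply average these scores over the test set. All function names are illustrative.

```python
def confusion_counts(pred, target):
    """Count TP/FP/FN/TN over two flattened binary masks (lists of 0/1)."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    return tp, fp, fn, tn

def iou(pred, target):
    """Intersection-over-Union: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    return tp / (tp + fp + fn) if (tp + fp + fn) else 1.0

def dice(pred, target):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

def pixel_accuracy(pred, target):
    """Fraction of pixels classified correctly."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return (tp + tn) / (tp + fp + fn + tn)

# Toy 6-pixel example: one false positive, one false negative.
pred   = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(iou(pred, target))             # → 0.5
print(round(dice(pred, target), 3))  # → 0.667
print(round(pixel_accuracy(pred, target), 3))  # → 0.667
```

Note that Dice weights the overlap (TP) twice, so it is always at least as large as IoU on the same prediction; reporting both, as the paper does, is common practice in medical image segmentation.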
📝 Abstract
This paper proposes a high-precision semantic segmentation method based on an improved TransUNet architecture to address the challenges of complex lesion structures, blurred boundaries, and significant scale variations in skin lesion images. The method integrates a Transformer module into the traditional encoder-decoder framework to model global semantic information, while retaining a convolutional branch to preserve local texture and edge features, enhancing the model's ability to perceive fine-grained structures. A boundary-guided attention mechanism and a multi-scale upsampling path are also designed to improve lesion boundary localization and segmentation consistency. To verify the effectiveness of the approach, a series of experiments was conducted, including comparative studies, hyperparameter sensitivity analysis, data augmentation effects, input resolution variation, and training data split ratio tests. Experimental results show that the proposed model outperforms existing representative methods in mIoU, mDice, and mAcc, demonstrating stronger lesion recognition accuracy and robustness. In particular, the model achieves better boundary reconstruction and structural recovery in complex scenarios, making it well-suited to the key demands of automated segmentation tasks in skin lesion analysis.
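The boundary-guided attention described above presupposes some boundary prior for the lesion. The abstract does not specify how that prior is obtained; one common, simple choice is a morphological-gradient-style map marking foreground pixels that touch the background. A stdlib-only sketch of that idea, with all names illustrative and no claim that this matches the paper's implementation:

```python
def boundary_map(mask):
    """Return a 0/1 map marking foreground pixels of a binary mask that
    have at least one background 4-neighbour (or lie on the image edge).
    Such a map could serve as a boundary prior for re-weighting edge
    features in an attention mechanism."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Out-of-bounds neighbours count as background.
                if not (0 <= ny < h and 0 <= nx < w) or mask[ny][nx] == 0:
                    out[y][x] = 1
                    break
    return out

# A 3x3 lesion blob: only its single interior pixel is non-boundary.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
b = boundary_map(mask)
print(sum(sum(row) for row in b))  # → 8 boundary pixels
```

In a full pipeline, such a map (or a learned equivalent predicted from intermediate features) would typically be blurred or distance-transformed before gating decoder features, since a one-pixel-wide ring gives a very sparse supervision signal.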