🤖 AI Summary
This work addresses the limited generalization of existing medical image lesion segmentation methods across anatomical sites and imaging modalities. It presents a systematic evaluation of the Segment Anything Model 3 (SAM3) for lesion segmentation in multimodal medical imaging, using geometric bounding-box prompts and concept-based text and image prompts. To improve robustness, the approach incorporates prior knowledge such as adjacent-slice predictions, multiparametric information, and prior annotations, and it compares fine-tuning strategies ranging from partial module tuning and adapters to full-model optimization. Experiments on 13 datasets covering 11 lesion types show that SAM3 achieves strong cross-modality generalization, accurate lesion delineation, and reliable concept-driven segmentation, highlighting the potential of concept-based foundation models for practical medical image segmentation.
📝 Abstract
Accurate lesion segmentation is essential in medical image analysis, yet most existing methods are designed for specific anatomical sites or imaging modalities, limiting their generalizability. Recent vision-language foundation models enable concept-driven segmentation in natural images, offering a promising direction for more flexible medical image analysis. However, concept-prompt-based lesion segmentation, particularly with the latest Segment Anything Model 3 (SAM3), remains underexplored.
In this work, we present a systematic evaluation of SAM3 for lesion segmentation. We assess its performance using geometric bounding boxes and concept-based text and image prompts across multiple modalities, including multiparametric MRI, CT, ultrasound, dermoscopy, and endoscopy. To improve robustness, we incorporate additional prior knowledge, such as adjacent-slice predictions, multiparametric information, and prior annotations. We further compare different fine-tuning strategies, including partial module tuning, adapter-based methods, and full-model optimization.
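As a rough illustration of how an adjacent-slice prediction could serve as a prior, the sketch below turns the mask predicted on one slice into a bounding-box prompt for the next. It is an assumption about one plausible mechanism, not the authors' implementation: `box_from_adjacent_mask`, `propagate_through_volume`, the `margin` parameter, and the `segment_fn` callback (a stand-in for any box-prompted segmenter) are all hypothetical names introduced here.

```python
import numpy as np


def box_from_adjacent_mask(prev_mask, margin=5):
    """Return (x_min, y_min, x_max, y_max) around the mask, or None if it is empty.

    Hypothetical helper: the box is padded by `margin` pixels and clipped
    to the image bounds so it can be reused as a prompt on the next slice.
    """
    ys, xs = np.nonzero(prev_mask)
    if ys.size == 0:
        return None  # nothing was segmented on the adjacent slice
    h, w = prev_mask.shape
    x_min = max(int(xs.min()) - margin, 0)
    y_min = max(int(ys.min()) - margin, 0)
    x_max = min(int(xs.max()) + margin, w - 1)
    y_max = min(int(ys.max()) + margin, h - 1)
    return (x_min, y_min, x_max, y_max)


def propagate_through_volume(volume, start_box, segment_fn, margin=5):
    """Segment slices in order, prompting each with a box from the previous mask.

    `segment_fn(slice_2d, box)` is an assumed callback that returns a boolean
    mask with the same shape as `slice_2d`. Once a slice yields an empty mask,
    all remaining slices receive empty masks.
    """
    masks = []
    box = start_box
    for slice_2d in volume:
        if box is None:
            masks.append(np.zeros(slice_2d.shape, dtype=bool))
            continue
        mask = np.asarray(segment_fn(slice_2d, box), dtype=bool)
        masks.append(mask)
        box = box_from_adjacent_mask(mask, margin)
    return np.stack(masks)
```

In practice `segment_fn` would wrap a call to the segmentation model with a box prompt. A small positive margin lets the box follow a lesion whose extent changes gradually between slices.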
Experiments on 13 datasets covering 11 lesion types demonstrate that SAM3 achieves strong cross-modality generalization, reliable concept-driven segmentation, and accurate lesion delineation. These results highlight the potential of concept-based foundation models for scalable and practical medical image segmentation. Code and trained models will be released at: https://github.com/apple1986/lesion-sam3