🤖 AI Summary
To address the limited domain adaptability and weak generalization of the Segment Anything Model (SAM) in medical image segmentation, this paper proposes SAM-TTA, a fine-tuning-free, purely test-time adaptation framework. Methodologically, it introduces (1) a Self-adaptive Bezier Curve-based Transformation (SBCT) that converts single-channel medical images into robust, SAM-compatible three-channel inputs, and (2) Dual-scale Uncertainty-driven Mean Teacher adaptation (DUMT), which uses consistency learning to align SAM's internal representations with medical semantics. Evaluated on five public medical imaging benchmarks, SAM-TTA outperforms existing test-time adaptation (TTA) methods and, in several scenarios, even surpasses the fully fine-tuned MedSAM. The result is plug-and-play, accurate, and generalizable segmentation of diverse medical images without offline fine-tuning or labeled target-domain data.
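The SBCT component maps single-channel intensities through a monotonic Bézier curve before replicating them into the three channels SAM expects. The paper selects the curve adaptively; the sketch below is a minimal, non-adaptive illustration of the underlying idea, with fixed endpoints at (0,0) and (1,1) and hypothetical control-point values chosen only for demonstration.

```python
import numpy as np

def bezier_intensity_map(img, p1, p2, n=1000):
    """Map normalized intensities through a monotonic cubic Bezier curve.

    Endpoints are fixed at (0, 0) and (1, 1); p1 and p2 are the two inner
    control points (hypothetical values here -- SBCT chooses them adaptively).
    """
    t = np.linspace(0.0, 1.0, n)
    # Cubic Bezier: B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3
    x = 3 * (1 - t) ** 2 * t * p1[0] + 3 * (1 - t) * t ** 2 * p2[0] + t ** 3
    y = 3 * (1 - t) ** 2 * t * p1[1] + 3 * (1 - t) * t ** 2 * p2[1] + t ** 3
    # With increasing control-point x-coordinates, x(t) is non-decreasing,
    # so it can serve as the sample grid for interpolation.
    return np.interp(img, x, y)

def to_sam_input(slice_1ch):
    """Convert a single-channel slice into a 3-channel SAM-style input."""
    lo, hi = slice_1ch.min(), slice_1ch.max()
    norm = (slice_1ch - lo) / (hi - lo + 1e-8)      # rescale to [0, 1]
    mapped = bezier_intensity_map(norm, p1=(0.3, 0.6), p2=(0.7, 0.4))
    return np.stack([mapped] * 3, axis=-1)          # H x W x 3
```

Because the curve's control points all lie in the unit square, the remapped intensities stay in [0, 1] while the nonlinear shape redistributes contrast, which is what makes the transformed slice look more like a natural RGB input to SAM.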
📝 Abstract
Universal medical image segmentation using the Segment Anything Model (SAM) remains challenging due to its limited adaptability to medical domains. Existing adaptations, such as MedSAM, enhance SAM's performance in medical imaging but at the cost of reduced generalization to unseen data. Therefore, in this paper, we propose SAM-aware Test-Time Adaptation (SAM-TTA), a fundamentally different pipeline that preserves the generalization of SAM while improving its segmentation performance in medical imaging via a test-time framework. SAM-TTA tackles two key challenges: (1) input-level discrepancies caused by differences in image acquisition between natural and medical images and (2) semantic-level discrepancies due to fundamental differences in object definition between natural and medical domains (e.g., clear boundaries vs. ambiguous structures). Specifically, our SAM-TTA framework comprises (1) Self-adaptive Bezier Curve-based Transformation (SBCT), which adaptively converts single-channel medical images into three-channel SAM-compatible inputs while maintaining structural integrity, to mitigate the input gap between medical and natural images, and (2) Dual-scale Uncertainty-driven Mean Teacher adaptation (DUMT), which employs consistency learning to align SAM's internal representations to medical semantics, enabling efficient adaptation without auxiliary supervision or expensive retraining. Extensive experiments on five public datasets demonstrate that our SAM-TTA outperforms existing TTA approaches and even surpasses fully fine-tuned models such as MedSAM in certain scenarios, establishing a new paradigm for universal medical image segmentation. Code can be found at https://github.com/JianghaoWu/SAM-TTA.