🤖 AI Summary
This study addresses the challenge of inconsistent medical image quality arising from variations in imaging devices and acquisition protocols by proposing a training-free test-time augmentation framework to enhance zero-shot segmentation performance. The method introduces, for the first time in zero-shot medical image segmentation, a multi-augmentation strategy (gamma correction, contrast enhancement, Gaussian blur, and Gaussian noise) combined with a weighted voting mechanism across multiple MedSAM2 checkpoints. The augmentation combination and the voting threshold can be adjusted to organ scale and clinical requirements. Evaluation on three medical imaging datasets shows consistent improvements in segmentation accuracy. On a multi-class hepatic vessel dataset, the method achieves a 1.6-point increase in mean Intersection over Union (mIoU), a 1.9-point gain in average IoU (aIoU), and a reduction of approximately 2.0 in the 95th-percentile Hausdorff Distance (HD95).
📝 Abstract
Increasingly advanced data augmentation techniques have greatly aided clinical medical research by increasing data diversity and improving model generalization. Although most current foundation models generalize well, image quality still varies with differences in equipment and operators. To address this challenge, we present SegTTA, a framework that improves medical image segmentation without model retraining by combining four augmentations (gamma correction, contrast enhancement, Gaussian blur, and Gaussian noise) with weighted voting across multiple MedSAM2 checkpoints. Experiments demonstrate consistent improvements across three diverse datasets: healthy uterus segmentation, uterine myoma detection, and multi-class hepatic structure segmentation. Ablation studies reveal that large organs benefit from intensity augmentations, while small lesions require noise augmentations. The voting threshold controls the coverage-precision trade-off, enabling task-specific optimization for different clinical requirements. On a multi-class hepatic vessel dataset, compared to MedSAM2 baselines, our method achieves an increase of 1.6 in mIoU and 1.9 in aIoU, along with a reduction of approximately 2.0 in HD95. Code will be available at https://github.com/AIGeeksGroup/SegTTA.
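To make the pipeline described above concrete, the following is a minimal NumPy sketch of test-time augmentation with weighted voting. It is not the authors' implementation: the function names, the parameter values (gamma, contrast factor, blur and noise sigma), and the `predictors` callables standing in for MedSAM2 checkpoints are all illustrative assumptions, and images are assumed to be 2D arrays scaled to [0, 1].

```python
import numpy as np

def gamma_correction(img, gamma=0.8):
    # Power-law intensity remapping; img is assumed to lie in [0, 1].
    return np.clip(img, 0.0, 1.0) ** gamma

def contrast_enhance(img, factor=1.5):
    # Stretch intensities around the image mean (one simple form of contrast enhancement).
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)

def gaussian_blur(img, sigma=1.0):
    # Separable Gaussian blur with reflect padding; output has the same shape as img.
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def gaussian_noise(img, sigma=0.05, rng=None):
    # Additive zero-mean Gaussian noise, clipped back to [0, 1].
    rng = np.random.default_rng() if rng is None else rng
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def weighted_vote(masks, weights, threshold=0.5):
    # Per-pixel weighted fraction of votes for "foreground", then a hard threshold.
    stacked = np.stack(masks).astype(float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    score = np.tensordot(w, stacked, axes=1)
    return score >= threshold

def segment_with_tta(img, predictors, weights, augmentations, threshold=0.5):
    # predictors: hypothetical callables mapping an image to a binary mask
    # (placeholders for MedSAM2 checkpoints). The augmentations here change
    # intensity only, so predicted masks need no inverse geometric transform.
    masks, mask_weights = [], []
    for predict, w in zip(predictors, weights):
        for augment in augmentations:
            masks.append(predict(augment(img)))
            mask_weights.append(w)
    return weighted_vote(masks, mask_weights, threshold)
```

In this sketch, raising `threshold` keeps only pixels that most weighted votes agree on (favoring precision), while lowering it admits pixels supported by fewer votes (favoring coverage). The list passed as `augmentations` can likewise be varied per task, for example intensity augmentations for large organs and noise for small lesions, in line with the ablation findings reported above.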