AI Summary
Pancreatic EUS images suffer from severe speckle noise, low contrast, and non-intuitive anatomical appearance, which limits the performance of fully supervised segmentation models and imposes heavy reliance on large-scale expert annotations. To address this, we propose the first text-prompted learning framework integrated into the Segment Anything Model (SAM) for lightweight, geometry-free automatic segmentation. Specifically, we employ the BiomedCLIP text encoder to model natural-language prompts and apply Low-Rank Adaptation (LoRA) for efficient fine-tuning of SAM, adjusting only 0.86% of its parameters. This strategy drastically reduces annotation dependency while preserving strong generalization. Evaluated on a public pancreatic EUS dataset, our method achieves an 82.69% Dice score and 85.28% normalized surface distance under fully automatic prompting, surpassing both state-of-the-art supervised models and baseline foundation models. The approach establishes a novel paradigm for low-resource medical image segmentation.
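The LoRA idea mentioned above (freezing the pretrained weights and training only a small low-rank update, so that a fraction like 0.86% of parameters is tuned) can be sketched as follows. This is a minimal illustration, not the paper's actual implementation; the class `LoRALinear`, the rank `r=4`, and the helper `trainable_fraction` are illustrative choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A and B are the only trainable parts."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False        # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

def trainable_fraction(model: nn.Module) -> float:
    """Fraction of parameters that receive gradients (cf. the 0.86% figure)."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return trainable / total
```

Because `B` is zero-initialized, the wrapped layer initially reproduces the frozen base layer exactly; training then moves only the small `A`/`B` factors, which is what keeps the tuned-parameter fraction tiny.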
Abstract
Pancreatic cancer carries a poor prognosis, and its management relies on endoscopic ultrasound (EUS) for targeted biopsy and radiotherapy. However, the speckle noise, low contrast, and unintuitive appearance of EUS make segmentation of pancreatic tumors with fully supervised deep learning (DL) models both error-prone and dependent on large, expert-curated annotation datasets. To address these challenges, we present TextSAM-EUS, a novel, lightweight, text-driven adaptation of the Segment Anything Model (SAM) that requires no manual geometric prompts at inference. Our approach leverages text prompt learning (context optimization) through the BiomedCLIP text encoder in conjunction with a LoRA-based adaptation of SAM's architecture to enable automatic pancreatic tumor segmentation in EUS, tuning only 0.86% of the total parameters. On the public Endoscopic Ultrasound Database of the Pancreas, TextSAM-EUS with automatic prompts attains 82.69% Dice and 85.28% normalized surface distance (NSD), and with manual geometric prompts reaches 83.10% Dice and 85.70% NSD, outperforming both existing state-of-the-art (SOTA) supervised DL models and foundation models (e.g., SAM and its variants). As the first attempt to incorporate prompt learning in SAM-based medical image segmentation, TextSAM-EUS offers a practical option for efficient and robust automatic EUS segmentation. Our code will be publicly available upon acceptance.
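The "text prompt learning (context optimization)" component can be sketched in the CoOp style: a small set of learnable context token embeddings is prepended to the embedded class-name tokens before they enter a frozen text encoder. This is a hedged sketch under assumed shapes; the class `LearnableTextPrompt`, the context length `n_ctx=8`, and the embedding size `512` are illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

class LearnableTextPrompt(nn.Module):
    """CoOp-style context optimization: n_ctx learnable token embeddings are
    prepended to embedded name tokens; only these context vectors are trained,
    while the downstream text encoder (e.g., BiomedCLIP's) stays frozen."""
    def __init__(self, n_ctx: int = 8, embed_dim: int = 512):
        super().__init__()
        # Learnable context vectors, small random init as in prompt-learning practice
        self.ctx = nn.Parameter(torch.randn(n_ctx, embed_dim) * 0.02)

    def forward(self, name_embeds: torch.Tensor) -> torch.Tensor:
        # name_embeds: (batch, n_name_tokens, embed_dim) - embedded class-name tokens
        b = name_embeds.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(b, -1, -1)   # share context across the batch
        return torch.cat([ctx, name_embeds], dim=1)     # (batch, n_ctx + n_name_tokens, embed_dim)
```

Only the context vectors carry gradients, so the text prompt is optimized end-to-end without hand-crafting prompt wording and without touching the encoder's weights.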