Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything Model

📅 2025-08-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Parotid lesion segmentation faces challenges including highly variable lesion morphology, ill-defined boundaries, and difficulty in obtaining precise prompts; moreover, existing methods inadequately incorporate clinical expert knowledge. To address these issues, the authors propose PG-SAM, a text-guided adaptation of the Segment Anything Model (SAM) for parotid gland lesion segmentation. PG-SAM converts unstructured clinical diagnostic reports into structured textual prompts to inject domain-specific prior knowledge. A cross-sequence attention mechanism jointly models multi-sequence imaging data (e.g., T1-, T2-, and contrast-enhanced sequences) and textual prompts. Segmentation is performed end-to-end using SAM’s decoder. Evaluated on independent datasets from three clinical centers, PG-SAM significantly outperforms state-of-the-art methods (p < 0.01), achieving Dice score improvements of 3.2–5.8%. These results demonstrate the efficacy and generalizability of text-guided segmentation across multi-center, multi-sequence medical image analysis.
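
The summary reports results as Dice scores. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient for two binary masks. The function name, the nested-list mask format, and the empty-mask convention are illustrative assumptions, not taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values (an illustrative format,
    not the paper's data representation).
    """
    flat_pred = [int(v) for row in pred for v in row]
    flat_target = [int(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as lesion in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 3 predicted pixels, 3 reference pixels, 2 of them shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = dice_score(pred, target)
print(round(score, 4))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no lesion pixels.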

📝 Abstract

Parotid gland lesion segmentation is essential for the treatment of parotid gland diseases. However, because lesions vary widely in size and have complex boundaries, accurate segmentation remains challenging. Recently, fine-tuning the Segment Anything Model (SAM) has shown remarkable performance in medical image segmentation. Nevertheless, SAM's interactive segmentation relies heavily on precise lesion prompts (points, boxes, masks, etc.), which are very difficult to obtain in real-world applications. In addition, current medical image segmentation methods run fully automatically and ignore the domain knowledge of medical experts. To address these limitations, we propose the parotid gland segment anything model (PG-SAM), an expert diagnosis text-guided SAM that incorporates expert domain knowledge for cross-sequence parotid gland lesion segmentation. Specifically, we first propose an expert diagnosis report-guided prompt generation module that automatically generates prompt information containing prior domain knowledge to guide the subsequent lesion segmentation process. Then, we introduce a cross-sequence attention module, which integrates the complementary information of different modalities to improve segmentation. Finally, the multi-sequence image features and generated prompts are fed into the decoder to obtain the segmentation result. Experimental results demonstrate that PG-SAM achieves state-of-the-art performance in parotid gland lesion segmentation across three independent clinical centers, validating its clinical applicability and the effectiveness of diagnostic text for enhancing image segmentation in real-world clinical settings.
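
The abstract does not give the internals of the cross-sequence attention module. The sketch below only illustrates the general idea of attending across MRI sequences, using plain scaled dot-product attention at a single spatial location. The function names, the sequence labels ("T1", "T2", "CE"), and the absence of learned projections are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_sequence_attention(features):
    """Fuse per-sequence feature vectors taken at one spatial location.

    `features` maps a sequence name to a feature vector of length d.
    Each sequence acts as a query and attends over all sequences
    (keys and values), so its fused vector mixes in complementary
    information from the others. Queries, keys, and values are the
    raw features here; a trained module would use learned projections.
    """
    names = list(features)
    dim = len(features[names[0]])
    fused = {}
    for query_name in names:
        query = features[query_name]
        # Scaled dot-product score between the query and every sequence.
        scores = [
            sum(q * k for q, k in zip(query, features[key_name])) / math.sqrt(dim)
            for key_name in names
        ]
        weights = softmax(scores)
        # Attention-weighted sum of the value vectors.
        fused[query_name] = [
            sum(w * features[name][i] for w, name in zip(weights, names))
            for i in range(dim)
        ]
    return fused


# Toy 2-dimensional features for three hypothetical sequences.
features = {"T1": [1.0, 0.0], "T2": [0.0, 1.0], "CE": [1.0, 1.0]}
fused = cross_sequence_attention(features)
for name, vector in fused.items():
    print(name, [round(v, 3) for v in vector])
```

Because the attention weights are non-negative and sum to one, each fused vector is a convex combination of the input vectors. In a full model the same operation would typically run over every location of the encoder feature maps before the fused features and the text-derived prompts reach the decoder.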

Problem

Research questions and friction points this paper is trying to address.

Accurate parotid gland lesion segmentation is challenging due to variable size and complex boundaries

Current methods lack expert domain knowledge and rely on difficult-to-obtain precise lesion prompts

The proposed PG-SAM integrates expert text guidance and cross-sequence attention to improve segmentation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Expert text-guided SAM for lesion segmentation

Cross-sequence attention for multi-modal integration

Automatic prompt generation with domain knowledge
