PG-SAM: Prior-Guided SAM with Medical for Multi-organ Segmentation

📅 2025-03-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the significant degradation in zero-shot performance of the Segment Anything Model (SAM) on multi-organ segmentation in medical images, which is attributed to coarse-grained textual prompts, substantial domain shift, and the mismatch between high-level semantics and pixel-level boundaries, this paper proposes Prior-Guided SAM (PG-SAM). First, it leverages a medical large language model to generate anatomy-aware, fine-grained textual priors, enabling precise cross-modal alignment. Second, it introduces a multi-level feature fusion decoder and an iterative mask optimizer to support prompt-free learning. Third, it establishes a unified semantic injection pipeline to enhance prior quality and boundary fidelity. Evaluated on the Synapse dataset, PG-SAM achieves state-of-the-art performance, with substantial improvements in organ boundary accuracy and enhanced cross-domain generalization. The implementation is publicly available.

📝 Abstract
Segment Anything Model (SAM) demonstrates powerful zero-shot capabilities; however, its accuracy and robustness significantly decrease when applied to medical image segmentation. Existing methods address this issue through modality fusion, integrating textual and image information to provide more detailed priors. In this study, we argue that the granularity of text and the domain gap affect the accuracy of the priors. Furthermore, the discrepancy between high-level abstract semantics and pixel-level boundary details in images can introduce noise into the fusion process. To address this, we propose Prior-Guided SAM (PG-SAM), which employs a fine-grained modality prior aligner to leverage specialized medical knowledge for better modality alignment. The core of our method lies in efficiently addressing the domain gap with fine-grained text from a medical LLM. Meanwhile, it also enhances the priors' quality after modality alignment, ensuring more accurate segmentation. In addition, our decoder enhances the model's expressive capabilities through multi-level feature fusion and iterative mask optimizer operations, supporting unprompted learning. We also propose a unified pipeline that effectively supplies high-quality semantic information to SAM. Extensive experiments on the Synapse dataset demonstrate that the proposed PG-SAM achieves state-of-the-art performance. Our anonymous code is released at https://github.com/logan-0623/PG-SAM.
Problem

Research questions and friction points this paper is trying to address.

Improves medical image segmentation accuracy with fine-grained priors
Reduces domain gap using medical LLM text alignment
Enhances segmentation via multi-level feature fusion and optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained modality prior aligner for medical images
Multi-level feature fusion and iterative mask optimizer
Unified pipeline supplying high-quality semantic information
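
The abstract does not spell out how the fine-grained modality prior aligner works, so the following is only a minimal, generic sketch of text-image alignment, not PG-SAM's implementation. It assumes each organ has a text embedding (for example, derived from a medical LLM description) and each image patch has an embedding in the same space; it scores patches against organ descriptions with cosine similarity and turns the scores into a per-patch class prior with a softmax. The function names, the toy 3-dimensional embeddings, and the `temperature` parameter are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_prior_map(patch_embeddings, text_embeddings, temperature=0.1):
    """Return, for every patch, a dict mapping organ name -> prior probability.

    patch_embeddings: list of vectors, one per image patch (hypothetical input).
    text_embeddings: dict of organ name -> vector for that organ's description.
    """
    names = list(text_embeddings)
    priors = []
    for patch in patch_embeddings:
        sims = [cosine(patch, text_embeddings[name]) for name in names]
        priors.append(dict(zip(names, softmax(sims, temperature))))
    return priors


# Toy example with made-up embeddings: two organs, two patches.
text_embeddings = {"liver": [1.0, 0.0, 0.0], "spleen": [0.0, 1.0, 0.0]}
patches = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]]
priors = text_prior_map(patches, text_embeddings)
for index, prior in enumerate(priors):
    print(index, max(prior, key=prior.get))
```

A lower `temperature` sharpens the prior toward the best-matching organ; in a real pipeline such a prior would be one input among several to the mask decoder rather than a final segmentation.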
Yiheng Zhong
Mohamed bin Zayed University of AI, Abu Dhabi, UAE; Xi’an Jiaotong-Liverpool University, Suzhou, China; University of Liverpool, Liverpool, United Kingdom
Zihong Luo
Mohamed bin Zayed University of AI, Abu Dhabi, UAE; Xi’an Jiaotong-Liverpool University, Suzhou, China; University of Liverpool, Liverpool, United Kingdom
Chengzhi Liu
PhD, UC Santa Barbara
Vision Language Model, Trustworthy AI, Reasoning
Feilong Tang
Mohamed bin Zayed University of AI, Abu Dhabi, UAE; Monash University, Melbourne, Australia
Zelin Peng
Shanghai Jiao Tong University
Computer Vision, Medical Image Processing
Ming Hu
Monash University, Melbourne, Australia
Yingzhen Hu
Xi’an Jiaotong-Liverpool University, Suzhou, China
Jionglong Su
Xi'an Jiaotong-Liverpool University
AI, Big Data, Machine Learning, Statistics
Zongyuan Ge
Monash University, Melbourne, Australia
Imran Razzak
MBZUAI, Abu Dhabi
Human-Centered AI, Medical Image Analysis, Medical Artificial Intelligence, Computational Biology