UniUltra: Interactive Parameter-Efficient SAM2 for Universal Ultrasound Segmentation

📅 2025-11-19
🤖 AI Summary
SAM2's performance degrades on ultrasound images because of the domain gap between natural and ultrasound imagery, and the model is impractical to deploy in resource-constrained clinical settings. To address both problems, this paper proposes a parameter-efficient adaptation framework. The method introduces (1) a context-edge hybrid adapter (CH-Adapter) to enhance fine-grained perception of anatomical structures, and (2) deep-supervised knowledge distillation (DSKD), which transfers knowledge from the fine-tuned SAM2 image encoder to a lightweight encoder so that the model is compressed while performance is preserved. Fine-tuning uses only 8.91% of SAM2's parameters, and the final compressed model has 94.08% fewer parameters than the original SAM2, which lowers inference overhead. Evaluated on a multi-center, multi-organ ultrasound dataset, the proposed approach outperforms existing state-of-the-art methods, demonstrating strong generalization and practical clinical deployability.

📝 Abstract
The Segment Anything Model 2 (SAM2) demonstrates remarkable universal segmentation capabilities on natural images. However, its performance on ultrasound images is significantly degraded due to domain disparities. This limitation raises two critical challenges: how to efficiently adapt SAM2 to ultrasound imaging while maintaining parameter efficiency, and how to deploy the adapted model effectively in resource-constrained clinical environments. To address these issues, we propose UniUltra for universal ultrasound segmentation. Specifically, we first introduce a novel context-edge hybrid adapter (CH-Adapter) that enhances fine-grained perception across diverse ultrasound imaging modalities while achieving parameter-efficient fine-tuning. To further improve clinical applicability, we develop a deep-supervised knowledge distillation (DSKD) technique that transfers knowledge from the large image encoder of the fine-tuned SAM2 to a super lightweight encoder, substantially reducing computational requirements without compromising performance. Extensive experiments demonstrate that UniUltra outperforms state-of-the-art methods with superior generalization capabilities. Notably, our framework achieves competitive performance using only 8.91% of SAM2's parameters during fine-tuning, and the final compressed model reduces the parameter count by 94.08% compared to the original SAM2, making it highly suitable for practical clinical deployment. The source code is available at https://github.com/xq141839/UniUltra.
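The two percentages in the abstract measure different things: 8.91% is the share of SAM2's parameters used during fine-tuning, while 94.08% is the reduction in total parameter count after distillation. The following is a minimal sketch of that arithmetic. The total parameter count and the function names are hypothetical placeholders chosen for illustration, not figures or code from the paper.

```python
def trainable_params(total_params: int, trainable_pct: float) -> float:
    """Number of parameters updated when only `trainable_pct` percent are tuned."""
    return total_params * trainable_pct / 100.0


def compressed_params(total_params: int, reduction_pct: float) -> float:
    """Parameters remaining after the count is reduced by `reduction_pct` percent."""
    return total_params * (1.0 - reduction_pct / 100.0)


# Hypothetical model size, used only to make the percentages concrete.
HYPOTHETICAL_TOTAL = 100_000_000

tuned = trainable_params(HYPOTHETICAL_TOTAL, 8.91)
remaining = compressed_params(HYPOTHETICAL_TOTAL, 94.08)

print(f"tuned during fine-tuning: {tuned:,.0f}")
print(f"remaining after compression: {remaining:,.0f}")
```

Under this assumed size, the fine-tuning budget and the compressed model are of the same order of magnitude, but they describe different stages: the first is how much of the full model is touched during adaptation, the second is how large the deployed model is.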
Problem

Research questions and friction points this paper is trying to address.

Adapting SAM2 to ultrasound images efficiently despite domain disparities
Maintaining parameter efficiency while enhancing ultrasound segmentation performance
Deploying adapted models effectively in resource-constrained clinical environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Context-edge hybrid adapter enhances ultrasound fine-grained perception
Deep-supervised knowledge distillation reduces computational requirements
Parameter-efficient fine-tuning achieves competitive performance with fewer parameters
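One common reading of deeply supervised distillation is that the student encoder is matched to the teacher at several intermediate stages rather than only at the final output. The sketch below illustrates that general idea with a per-stage mean squared error on plain Python lists. The loss form, the stage weights, and the function names are assumptions made for illustration; the source does not specify the paper's actual DSKD formulation.

```python
from typing import Optional, Sequence


def mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between two equally sized feature vectors."""
    if len(student) != len(teacher) or len(student) == 0:
        raise ValueError("feature vectors must be non-empty and the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def deep_supervised_distill_loss(
    student_feats: Sequence[Sequence[float]],
    teacher_feats: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-stage MSE terms (hypothetical loss, one term per stage)."""
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must expose the same number of stages")
    if weights is None:
        weights = [1.0] * len(student_feats)  # assume equal stage weights
    if len(weights) != len(student_feats):
        raise ValueError("one weight per stage is required")
    return sum(
        w * mse(s, t) for w, s, t in zip(weights, student_feats, teacher_feats)
    )


# Toy features for two encoder stages (flattened); values are made up.
teacher = [[1.0, 2.0], [0.0, 0.0, 4.0]]
student = [[1.0, 0.0], [0.0, 0.0, 1.0]]

loss_equal = deep_supervised_distill_loss(student, teacher)
loss_weighted = deep_supervised_distill_loss(student, teacher, weights=[0.5, 1.0])
print(loss_equal, loss_weighted)
```

Supervising every stage gives the lightweight encoder a training signal at shallow layers as well as deep ones, which is the usual motivation for deep supervision. In a real pipeline the lists would be feature maps from the two encoders, and the student features would typically need a projection to match the teacher's dimensions.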
Yue Li
School of Computer Science, University of Nottingham Ningbo China, Ningbo, Zhejiang, China, and School of Computer Science, University of Nottingham, UK
Qing Xu
School of Computer Science, University of Nottingham Ningbo China, Ningbo, Zhejiang, China
Yixuan Zhang
School of Computer Science, University of Nottingham Ningbo China, Ningbo, Zhejiang, China
Xiangjian He
University of Nottingham Ningbo China (2022.5--), University of Technology Sydney (1999.2-2022.5)
Computer Vision, Machine Learning, Data Analytics
Qian Zhang
School of Computer Science, University of Nottingham Ningbo China, Ningbo, Zhejiang, China
Yuan Yao
School of Computer Science, University of Nottingham Ningbo China, Ningbo, Zhejiang, China
Fiseha B. Tesem
School of Computer Science, University of Nottingham Ningbo China, Ningbo, Zhejiang, China
Xin Chen
School of Computer Science, University of Nottingham, UK
Ruili Wang
School of Mathematical and Computational Sciences, Massey University, Auckland 0632, New Zealand, also with the School of Computer Science, University of Nottingham Ningbo China, Ningbo 315104, China, and also with the School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
Zhen Chen
Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences, Hong Kong SAR
Chang Wen Chen
Chair Professor of Visual Computing, The Hong Kong Polytechnic University
multimedia communication, multimedia systems, image/video processing, multimedia signal processing