InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective

📅 2025-05-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the weak zero-shot transferability of the Segment Anything Model (SAM) to specialized domains, this paper proposes a mutual information-driven parameter-efficient fine-tuning (PEFT) method. Our approach is the first to jointly model mutual information maximization and pseudo-invariant information compression, establishing a domain-invariant relational distillation framework. Built upon LoRA adaptation, it enables teacher–student collaborative knowledge transfer while preserving SAM’s pre-trained domain-invariant segmentation structures. Evaluated on multiple specialized segmentation benchmarks—including medical imaging and remote sensing—our method consistently outperforms mainstream PEFT baselines, achieving average mIoU improvements of 3.2–5.7 percentage points. The results demonstrate strong cross-domain generalization and validate the effectiveness and advancement of an information-theoretic, structure-aware fine-tuning paradigm for domain-specific vision tasks.

Technology Category

Application Category

📝 Abstract
The Segment Anything Model (SAM), a vision foundation model, exhibits impressive zero-shot capabilities in general tasks but struggles in specialized domains. Parameter-efficient fine-tuning (PEFT) is a promising approach to unleash the potential of SAM in novel scenarios. However, existing PEFT methods for SAM neglect the domain-invariant relations encoded in the pre-trained model. To bridge this gap, we propose InfoSAM, an information-theoretic approach that enhances SAM fine-tuning by distilling and preserving its pre-trained segmentation knowledge. Specifically, we formulate the knowledge transfer process as two novel mutual information-based objectives: (i) to compress the domain-invariant relation extracted from pre-trained SAM, excluding pseudo-invariant information as possible, and (ii) to maximize mutual information between the relational knowledge learned by the teacher (pre-trained SAM) and the student (fine-tuned model). The proposed InfoSAM establishes a robust distillation framework for PEFT of SAM. Extensive experiments across diverse benchmarks validate InfoSAM's effectiveness in improving SAM family's performance on real-world tasks, demonstrating its adaptability and superiority in handling specialized scenarios.
Problem

Research questions and friction points this paper is trying to address.

Enhancing SAM fine-tuning for specialized domains
Preserving domain-invariant relations in pre-trained SAM
Improving SAM's performance in real-world specialized tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

InfoSAM enhances SAM with information-theoretic fine-tuning
Uses mutual information to preserve domain-invariant relations
Robust distillation framework for parameter-efficient SAM adaptation
🔎 Similar Papers
No similar papers found.
Y
Yuanhong Zhang
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; Ministry of Education Key Laboratory of Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an, China
M
Muyao Yuan
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; Ministry of Education Key Laboratory of Intelligent Networks and Network Security, Xi’an Jiaotong University, Xi’an, China
Weizhan Zhang
Weizhan Zhang
Professor,Department of Computer Science and Technology, Xi'an Jiaotong University
Multimedia networking
Tieliang Gong
Tieliang Gong
Xi'an Jiaotong University
machine learningstatistical learning theoryinformation theory
W
Wen Wen
School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; Shaanxi Province Key Laboratory of Big Data Knowledge Engineering, Xi’an Jiaotong University, Xi’an, China
J
Jiangyong Ying
China Telecom E-surfing Vision Technology Co., Ltd, Hangzhou, China
Weijie Shi
Weijie Shi
Hong Kong University of Science and Technology