InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective

📅 2025-05-28

📈 Citations: 0

✨ Influential: 0

career value

230K/year

🤖 AI Summary

To address the weak zero-shot transferability of the Segment Anything Model (SAM) to specialized domains, this paper proposes a mutual information-driven parameter-efficient fine-tuning (PEFT) method. Our approach is the first to jointly model mutual information maximization and pseudo-invariant information compression, establishing a domain-invariant relational distillation framework. Built upon LoRA adaptation, it enables teacher–student collaborative knowledge transfer while preserving SAM’s pre-trained domain-invariant segmentation structures. Evaluated on multiple specialized segmentation benchmarks—including medical imaging and remote sensing—our method consistently outperforms mainstream PEFT baselines, achieving average mIoU improvements of 3.2–5.7 percentage points. The results demonstrate strong cross-domain generalization and validate the effectiveness and advancement of an information-theoretic, structure-aware fine-tuning paradigm for domain-specific vision tasks.

Technology Category

Application Category

📝 Abstract

The Segment Anything Model (SAM), a vision foundation model, exhibits impressive zero-shot capabilities in general tasks but struggles in specialized domains. Parameter-efficient fine-tuning (PEFT) is a promising approach to unleash the potential of SAM in novel scenarios. However, existing PEFT methods for SAM neglect the domain-invariant relations encoded in the pre-trained model. To bridge this gap, we propose InfoSAM, an information-theoretic approach that enhances SAM fine-tuning by distilling and preserving its pre-trained segmentation knowledge. Specifically, we formulate the knowledge transfer process as two novel mutual information-based objectives: (i) to compress the domain-invariant relation extracted from pre-trained SAM, excluding pseudo-invariant information as possible, and (ii) to maximize mutual information between the relational knowledge learned by the teacher (pre-trained SAM) and the student (fine-tuned model). The proposed InfoSAM establishes a robust distillation framework for PEFT of SAM. Extensive experiments across diverse benchmarks validate InfoSAM's effectiveness in improving SAM family's performance on real-world tasks, demonstrating its adaptability and superiority in handling specialized scenarios.

Problem

Research questions and friction points this paper is trying to address.

Enhancing SAM fine-tuning for specialized domains

Preserving domain-invariant relations in pre-trained SAM

Improving SAM's performance in real-world specialized tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

InfoSAM enhances SAM with information-theoretic fine-tuning

Uses mutual information to preserve domain-invariant relations

Robust distillation framework for parameter-efficient SAM adaptation

🔎 Similar Papers

On Efficient Variants of Segment Anything Model: A Survey