Collaborative Learning with Multiple Foundation Models for Source-Free Domain Adaptation

📅 2025-11-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Source-free domain adaptation (SFDA) suffers from insufficient semantic coverage by single models and poor robustness to domain shifts. To address this, we propose CoMA, a collaborative multi-foundation-model adaptation framework that pioneers the integration of multiple pre-trained vision-language models (e.g., CLIP, BLIP) into SFDA. CoMA employs a bidirectional adaptation mechanism to jointly leverage global semantics and local contextual representations across models, and introduces a Decomposed Mutual Information (DMI) module to suppress spurious correlations and enhance robust knowledge transfer. The method operates without source data, supports mini-batch training, and unifies semantic alignment with knowledge distillation. Extensive experiments on Office-31, Office-Home, DomainNet-126, and VisDA demonstrate that CoMA consistently outperforms state-of-the-art methods under closed-set, partial-set, and open-set SFDA settings, comprehensively validating the efficacy of collaborative multi-model adaptation for improving target-domain generalization.

📝 Abstract
Source-Free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to an unlabeled target domain without access to source data. Recent advances in Foundation Models (FMs) have introduced new opportunities for leveraging external semantic knowledge to guide SFDA. However, relying on a single FM is often insufficient, as it tends to bias adaptation toward a restricted semantic coverage, failing to capture diverse contextual cues under domain shift. To overcome this limitation, we propose a Collaborative Multi-foundation Adaptation (CoMA) framework that jointly leverages two different FMs (e.g., CLIP and BLIP) with complementary properties to capture both global semantics and local contextual cues. Specifically, we employ a bidirectional adaptation mechanism that (1) aligns different FMs with the target model for task adaptation while maintaining their semantic distinctiveness, and (2) transfers complementary knowledge from the FMs to the target model. To ensure stable adaptation under mini-batch training, we introduce Decomposed Mutual Information (DMI) that selectively enhances true dependencies while suppressing false dependencies arising from incomplete class coverage. Extensive experiments demonstrate that our method consistently outperforms existing state-of-the-art SFDA methods across four benchmarks, including Office-31, Office-Home, DomainNet-126, and VisDA, under the closed-set setting, while also achieving best results on partial-set and open-set variants.
Problem

Research questions and friction points this paper is trying to address.

Overcoming single foundation model bias in source-free domain adaptation
Leveraging complementary foundation models for global and local cues
Ensuring stable adaptation with incomplete class coverage in SFDA
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages two complementary foundation models for adaptation
Uses bidirectional mechanism to align and transfer knowledge
Introduces Decomposed Mutual Information for stable training
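
To make the Innovation points concrete, here is a minimal, hypothetical sketch of the kind of objective the abstract describes: distilling an ensemble of two foundation-model teachers into the target model, plus a standard mini-batch mutual-information term. All function and variable names are illustrative assumptions, not the paper's actual implementation; the simple teacher average stands in for CoMA's bidirectional mechanism, and the plain MI estimate is exactly the quantity whose batch-level bias the proposed DMI module is meant to correct.

```python
import torch
import torch.nn.functional as F

def collaborative_distillation_loss(target_logits, fm1_probs, fm2_probs, eps=1e-8):
    """Illustrative sketch (not the paper's code): combine two
    foundation-model teachers (e.g., CLIP- and BLIP-style) and add a
    mini-batch information-maximization term for the target model.

    target_logits:        (B, C) logits from the target model
    fm1_probs, fm2_probs: (B, C) soft predictions from the two FMs
    """
    p = F.softmax(target_logits, dim=1)

    # Naive teacher ensemble; CoMA's bidirectional adaptation is more
    # involved -- this average is only a stand-in for complementary transfer.
    teacher = 0.5 * (fm1_probs + fm2_probs)

    # Knowledge distillation: KL(teacher || student).
    kd = F.kl_div(torch.log(p + eps), teacher, reduction="batchmean")

    # Mutual information I(x; y_hat) ~= H(marginal) - mean H(conditional).
    # Estimated on a mini-batch, the marginal term is biased whenever the
    # batch misses some classes -- the instability that DMI targets by
    # separating true from spurious (false) dependencies.
    ent_cond = -(p * torch.log(p + eps)).sum(dim=1).mean()
    p_marg = p.mean(dim=0)
    ent_marg = -(p_marg * torch.log(p_marg + eps)).sum()
    mi = ent_marg - ent_cond

    # Minimize distillation error while maximizing mutual information.
    return kd - mi
```

Under this reading, the paper's contribution is not the loss shape itself (distillation plus information maximization is common in SFDA) but the decomposition of the MI estimate so that dependencies induced by incomplete class coverage in a batch are suppressed rather than amplified.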