Closing the Confusion Loop: CLIP-Guided Alignment for Source-Free Domain Adaptation

πŸ“… 2026-02-09
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the challenge of asymmetric and dynamic class confusion in the target domain caused by inter-class visual similarity under source-free domain adaptation settings, where source data are unavailable. To tackle this issue, the authors propose the CLIP-Guided Alignment (CGA) framework, which explicitly models and leverages such confusion to enhance pseudo-label quality and classification performance. CGA introduces a Multi-directional Confusion Awareness (MCA) module to detect directional confusion pairs, a Misclassification-aware CLIP Prompting (MCC) module to generate confusion-aware textual prompts for CLIP, and a Feature Alignment Module (FAM) that aligns source-model features with CLIP’s confusion-guided representations via contrastive learning. Extensive experiments demonstrate that CGA significantly outperforms existing source-free domain adaptation methods, particularly excelling in fine-grained and high-confusion scenarios.
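The summary above describes two of CGA's steps in enough detail to sketch: MCA mines directional confusion pairs from the source model's predictions on target data, and MCC turns those pairs into confusion-aware prompts. The snippet below is a minimal numpy sketch of that idea, not the authors' implementation: the runner-up-probability criterion, the `threshold` value, and the prompt template are all assumptions made for illustration.

```python
import numpy as np

def directional_confusion_pairs(probs, threshold=0.15):
    """Estimate asymmetric confusion pairs from source-model soft
    predictions on target samples (a simplified stand-in for the MCA
    module; the paper's actual detection rule is not reproduced here).

    probs: (N, C) softmax outputs of the source model.
    Returns sorted (predicted_class, confused_with) index pairs where
    the runner-up class receives a large share of probability mass.
    Direction matters: (a, b) and (b, a) are distinct pairs.
    """
    pairs = set()
    for p in probs:
        order = np.argsort(p)[::-1]      # classes by descending score
        first, second = order[0], order[1]
        if p[second] >= threshold:       # runner-up is a serious rival
            pairs.add((int(first), int(second)))
    return sorted(pairs)

def confusion_prompts(pairs, class_names):
    """Build confusion-aware text prompts in the style the abstract
    quotes ('a truck that looks like a bus'); the exact template used
    by MCC is an assumption."""
    return [f"a photo of a {class_names[a]} that looks like a {class_names[b]}"
            for a, b in pairs]

# Toy example: 3 target samples over classes [bus, truck, car].
probs = np.array([[0.55, 0.40, 0.05],   # bus vs. truck is ambiguous
                  [0.10, 0.85, 0.05],   # confident truck
                  [0.48, 0.06, 0.46]])  # bus vs. car is ambiguous
names = ["bus", "truck", "car"]
pairs = directional_confusion_pairs(probs)
prompts = confusion_prompts(pairs, names)
```

In a full pipeline these prompts would be fed through CLIP's text encoder to score each ambiguous sample against its confusion-aware descriptions, which is what makes the resulting pseudo-labels context-sensitive.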

πŸ“ Abstract
Source-Free Domain Adaptation (SFDA) tackles the problem of adapting a pre-trained source model to an unlabeled target domain without accessing any source data, which makes it well suited to data-security-sensitive settings. Although recent advances have shown that pseudo-labeling strategies can be effective, they often fail in fine-grained scenarios due to subtle inter-class similarities. A critical but underexplored issue is the presence of asymmetric and dynamic class confusion, where visually similar classes are unequally and inconsistently misclassified by the source model. Existing methods typically ignore such confusion patterns, leading to noisy pseudo-labels and poor target discrimination. To address this, we propose CLIP-Guided Alignment (CGA), a novel framework that explicitly models and mitigates class confusion in SFDA. Our method consists of three parts: (1) MCA: first detects directional confusion pairs by analyzing the predictions of the source model in the target domain; (2) MCC: leverages CLIP to construct confusion-aware textual prompts (e.g., "a truck that looks like a bus"), enabling more context-sensitive pseudo-labeling; and (3) FAM: builds confusion-guided feature banks for both CLIP and the source model and aligns them using contrastive learning to reduce ambiguity in the representation space. Extensive experiments on various datasets demonstrate that CGA consistently outperforms state-of-the-art SFDA methods, with especially notable gains in confusion-prone and fine-grained scenarios. Our results highlight the importance of explicitly modeling inter-class confusion for effective source-free adaptation. Our code can be found at https://github.com/soloiro/CGA
Problem

Research questions and friction points this paper is trying to address.

Source-Free Domain Adaptation
Class Confusion
Fine-Grained Recognition
Pseudo-Labeling
Domain Adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Source-Free Domain Adaptation
Class Confusion
CLIP-Guided Alignment
Pseudo-Labeling
Contrastive Learning
πŸ”Ž Similar Papers
No similar papers found.
Shanshan Wang
AnHui University
Domain Adaptation · Domain Generalization · AI for Education
Ziying Feng
State Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, China
Xiaozheng Shen
State Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, China
Xun Yang
Department of Electronic Engineering and Information Science, School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China
Pichao Wang
Amazon, U.S.A.
Zhenwei He
College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400044, China
Xingyi Zhang
MBZUAI
graph representation learning · AI4Science · geometric deep learning