Target Semantics Clustering via Text Representations for Robust Universal Domain Adaptation

📅 2025-04-11
🏛️ AAAI Conference on Artificial Intelligence
📈 Citations: 0
Influential: 0
🤖 AI Summary
UniDA confronts dual challenges: domain shift and unknown-class shift. Existing methods estimate target-domain semantic centers in continuous image spaces, suffering from noise sensitivity and difficulty in determining the optimal cluster count, which yields non-robust alignment. This paper migrates target-domain semantic clustering into the discrete, semantically grounded text embedding space of a frozen vision-language model. The proposed TASC framework combines a greedy text-space search under an information-maximization objective with gradient-based encoder refinement, jointly achieving shared-class alignment and private-class clustering; it is complemented by Universal Maximum Similarity (UniMS), a scoring function for detecting open-set samples. Evaluated across four standard UniDA benchmarks and four category-shift scenarios, the approach achieves state-of-the-art performance, improving cross-domain robustness and private-class identification accuracy.

📝 Abstract
Universal Domain Adaptation (UniDA) focuses on transferring source domain knowledge to the target domain under both domain shift and unknown category shift. Its main challenge lies in identifying common class samples and aligning them. Current methods typically obtain target-domain semantic centers from an unconstrained continuous image representation space. Due to domain shift and the unknown number of clusters, these centers often result in complex and less robust alignment algorithms. In this paper, based on vision-language models, we search for semantic centers in a semantically meaningful and discrete text representation space. The constrained space ensures almost no domain bias and appropriate semantic granularity for these centers, enabling a simple and robust adaptation algorithm. Specifically, we propose TArget Semantics Clustering (TASC) via Text Representations, which leverages information maximization as a unified objective and involves two stages. First, with the frozen encoders, a greedy search-based framework is used to search for an optimal set of text embeddings to represent target semantics. Second, with the search results fixed, the encoders are refined via gradient descent, simultaneously achieving robust domain alignment and private class clustering. Additionally, we propose Universal Maximum Similarity (UniMS), a scoring function tailored for detecting open-set samples in UniDA. Experimentally, we evaluate the universality of UniDA algorithms under four category-shift scenarios. Extensive experiments on four benchmarks demonstrate the effectiveness and robustness of our method, which achieves state-of-the-art performance.
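The abstract's first stage — greedily selecting text embeddings under an information-maximization objective — can be illustrated with a minimal sketch. This is not the paper's implementation: the candidate pool, the temperature, the cosine-similarity softmax, and the gain-based stopping rule are all assumptions; only the objective (marginal entropy minus mean conditional entropy) and the greedy search idea come from the abstract.

```python
import numpy as np

def im_objective(feats, centers, temp=0.07):
    # Information-maximization objective over target features:
    # high marginal entropy (diverse cluster usage) minus
    # low conditional entropy (confident per-sample assignments).
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    c = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    logits = f @ c.T / temp                      # cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    marg = p.mean(axis=0)
    h_marg = -(marg * np.log(marg + 1e-12)).sum()         # diversity term
    h_cond = -(p * np.log(p + 1e-12)).sum(axis=1).mean()  # confidence term
    return h_marg - h_cond

def greedy_text_search(feats, pool, max_k=10):
    # Greedily add the candidate text embedding that most increases the
    # objective; stop when no candidate improves it (hypothetical rule).
    chosen, best = [], -np.inf
    for _ in range(max_k):
        gains = []
        for i in range(len(pool)):
            if i in chosen:
                gains.append(-np.inf)
                continue
            gains.append(im_objective(feats, pool[chosen + [i]]))
        i_best = int(np.argmax(gains))
        if gains[i_best] <= best:
            break
        best = gains[i_best]
        chosen.append(i_best)
    return chosen
```

On synthetic target features clustered around two of four candidate embeddings, the greedy search picks those two first, since they jointly maximize cluster diversity while keeping assignments confident.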
Problem

Research questions and friction points this paper is trying to address.

Identifying common class samples under domain shift
Aligning target domain semantics with constrained text space
Detecting open-set samples in Universal Domain Adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses text representation space for clustering
Leverages vision-language models for semantics
Proposes Universal Maximum Similarity scoring
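The maximum-similarity idea behind the proposed open-set score can be sketched generically. The paper's exact UniMS formula is not given here, so the sketch below is just a plain maximum cosine similarity to the searched text embeddings with an assumed fixed threshold; the function names and the threshold rule are hypothetical.

```python
import numpy as np

def max_similarity_score(feat, text_embs):
    # Maximum cosine similarity between an image feature and the
    # set of searched text embeddings (generic stand-in for UniMS).
    f = feat / np.linalg.norm(feat)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return float((t @ f).max())

def is_known_class(feat, text_embs, threshold=0.5):
    # Open-set decision: samples far from every semantic center
    # (low maximum similarity) are treated as private/unknown.
    return max_similarity_score(feat, text_embs) >= threshold
```

A feature aligned with one of the centers scores near 1 and is kept as a known-class sample; a feature dissimilar to every center falls below the threshold and is flagged as open-set.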
Weinan He
University of Science and Technology of China
Domain adaptation · Vision-language models
Zilei Wang
University of Science and Technology of China
Computer Vision · Deep Learning · Pattern Recognition
Yixin Zhang
University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center