The Finer the Better: Towards Granular-aware Open-set Domain Generalization

📅 2025-11-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing open-set domain generalization (OSDG) methods struggle to jointly optimize structural risk and open-space risk, particularly yielding overconfident predictions for “hard unknown” classes with fine-grained visual similarity to known categories. To address this, we propose a semantic-enhanced OSDG framework. First, we introduce semantic-aware prompt learning to explicitly encode semantic priors of known classes. Second, we design a dual-contrastive learning mechanism that jointly optimizes the decision boundary via “known-known cohesion” and “known-unknown separation.” Third, we leverage a CLIP-guided semantic diffusion model to synthesize high-fidelity pseudo-unknown samples, thereby strengthening hard negative learning. Evaluated on five standard benchmarks, our method achieves average improvements of +3.0% in accuracy and +5.0% in H-score over state-of-the-art approaches. It is the first OSDG framework to achieve decoupled modeling of known and unknown risks under fine-grained semantic guidance.

Technology Category

Application Category

📝 Abstract
Open-Set Domain Generalization (OSDG) tackles the realistic scenario where deployed models encounter both domain shifts and novel object categories. Despite impressive progress with vision-language models like CLIP, existing methods still fall into the dilemma between structural risk of known-classes and open-space risk from unknown-classes, and easily suffers from over-confidence, especially when distinguishing ``hard unknowns" that share fine-grained visual similarities with known classes. To this end, we propose a Semantic-enhanced CLIP (SeeCLIP) framework that explicitly addresses this dilemma through fine-grained semantic enhancement. In SeeCLIP, we propose a semantic-aware prompt enhancement module to decompose images into discriminative semantic tokens, enabling nuanced vision-language alignment beyond coarse category labels. To position unknown prompts effectively, we introduce duplex contrastive learning with complementary objectives, that is, repulsion to maintain separability from known classes, and cohesion to preserve semantic proximity. Further, our semantic-guided diffusion module synthesizes pseudo-unknowns by perturbing extracted semantic tokens, generating challenging samples that are visually similar to known classes yet exhibit key local differences. These hard negatives force the model to learn finer decision boundaries. Extensive experiments across five benchmarks demonstrate consistent improvements of 3% accuracy and 5% H-score over state-of-the-art methods.
Problem

Research questions and friction points this paper is trying to address.

Addresses domain shifts and novel categories in open-set generalization
Mitigates over-confidence when distinguishing visually similar unknowns
Resolves structural risk versus open-space risk dilemma through fine-grained semantics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-aware prompt enhancement for fine-grained alignment
Duplex contrastive learning with repulsion and cohesion
Semantic-guided diffusion generates hard negative samples
🔎 Similar Papers
No similar papers found.
Y
Yunyun Wang
School of Computer Science, University of Posts and Telecommunications, Nanjing, China
Zheng Duan
Zheng Duan
School of Computer Science, University of Posts and Telecommunications, Nanjing, China
X
Xinyue Liao
School of Computer Science, University of Posts and Telecommunications, Nanjing, China
K
Ke-Jia Chen
School of Computer Science, University of Posts and Telecommunications, Nanjing, China
Songcan Chen
Songcan Chen
Nanjing University of Aeronautics & Astronautics
Machine LearningPattern recognition