🤖 AI Summary
Existing open-set domain generalization (OSDG) methods struggle to jointly optimize structural risk and open-space risk, and in particular yield overconfident predictions for "hard unknown" classes with fine-grained visual similarity to known categories. To address this, we propose a semantic-enhanced OSDG framework. First, we introduce semantic-aware prompt learning to explicitly encode semantic priors of known classes. Second, we design a duplex contrastive learning mechanism that jointly optimizes the decision boundary via "known-known cohesion" and "known-unknown separation." Third, we leverage a CLIP-guided semantic diffusion model to synthesize high-fidelity pseudo-unknown samples, thereby strengthening hard-negative learning. Evaluated on five standard benchmarks, our method achieves average improvements of +3.0% in accuracy and +5.0% in H-score over state-of-the-art approaches. It is the first OSDG framework to achieve decoupled modeling of known and unknown risks under fine-grained semantic guidance.
📝 Abstract
Open-Set Domain Generalization (OSDG) tackles the realistic scenario where deployed models encounter both domain shifts and novel object categories. Despite impressive progress with vision-language models such as CLIP, existing methods still face a dilemma between the structural risk of known classes and the open-space risk of unknown classes, and easily suffer from over-confidence, especially when distinguishing "hard unknowns" that share fine-grained visual similarities with known classes. To this end, we propose a Semantic-enhanced CLIP (SeeCLIP) framework that explicitly addresses this dilemma through fine-grained semantic enhancement. In SeeCLIP, we propose a semantic-aware prompt enhancement module that decomposes images into discriminative semantic tokens, enabling nuanced vision-language alignment beyond coarse category labels. To position unknown prompts effectively, we introduce duplex contrastive learning with complementary objectives: repulsion, to maintain separability from known classes, and cohesion, to preserve semantic proximity. Further, our semantic-guided diffusion module synthesizes pseudo-unknowns by perturbing the extracted semantic tokens, generating challenging samples that are visually similar to known classes yet exhibit key local differences. These hard negatives force the model to learn finer decision boundaries. Extensive experiments across five benchmarks demonstrate consistent improvements of 3% in accuracy and 5% in H-score over state-of-the-art methods.
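To make the duplex contrastive idea concrete, here is a minimal toy sketch of how a repulsion term (keep the unknown prompt separable from known-class prompts) and a cohesion term (keep it within semantic proximity of the known classes) might be combined. The function name, the margin-based hinge formulation, and the margin values are illustrative assumptions, not the paper's actual loss.

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two sets of embeddings."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def duplex_contrastive_loss(unknown_prompt, known_prompts,
                            repel_margin=0.5, cohere_margin=0.1):
    """Toy duplex objective (hypothetical, margins are illustrative).

    repulsion: penalize the unknown prompt for being too similar
               to ANY known-class prompt (separability).
    cohesion:  penalize the unknown prompt if even its NEAREST
               known class is too dissimilar (semantic proximity).
    """
    sims = cosine_sim(unknown_prompt[None, :], known_prompts)[0]
    # Hinge on similarity above the repulsion margin, averaged over classes.
    repulsion = np.maximum(0.0, sims - repel_margin).mean()
    # Hinge on the nearest known class falling below the cohesion margin.
    cohesion = np.maximum(0.0, cohere_margin - sims.max())
    return repulsion + cohesion
```

With these margins, an unknown prompt that collapses onto a known-class prompt incurs a repulsion penalty, while one that drifts far from all known classes incurs a cohesion penalty, which is the tension the abstract describes.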