🤖 AI Summary
This paper addresses low-shot open-set domain generalization (LSOSDG), a novel task that confronts two key challenges: (1) the weak domain generalization capability of vision-language models (e.g., CLIP) under extreme low-shot supervision (e.g., 1-shot), and (2) inaccurate rejection of open-set samples exhibiting fine-grained semantic deviations from known classes. To tackle these, the authors propose a domain-agnostic prompt learning framework coupled with pseudo open-set sample synthesis. Specifically, they design learnable, domain- and class-agnostic visual prompts and incorporate a cross-attention module to explicitly model vision–language alignment. Additionally, they leverage off-the-shelf foundation models to synthesize pseudo open-set samples in a targeted manner, thereby enhancing discriminative capacity for fine-grained unknown classes. Extensive experiments across five benchmarks demonstrate consistent and significant improvements over state-of-the-art methods, achieving superior low-shot domain generalization accuracy and markedly enhanced open-set rejection performance, particularly for fine-grained semantic outliers.
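The cross-attention alignment described above can be sketched minimally: class-text embeddings act as queries over a small set of learnable visual prompt tokens, producing visually grounded semantic attributes. This is an illustrative sketch, not the paper's implementation; all names (`cross_attention`, the dimensions) are hypothetical, and NumPy stands in for a deep-learning framework.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_emb, visual_prompts):
    # text_emb: (C, d) class-text queries
    # visual_prompts: (P, d) learnable domain-/class-generic tokens (keys and values)
    d = text_emb.shape[-1]
    attn = softmax(text_emb @ visual_prompts.T / np.sqrt(d), axis=-1)  # (C, P)
    return attn @ visual_prompts  # (C, d) visually guided semantic attributes

rng = np.random.default_rng(0)
C, P, d = 5, 4, 8  # hypothetical sizes: 5 classes, 4 prompt tokens, dim 8
text_emb = rng.standard_normal((C, d))
visual_prompts = rng.standard_normal((P, d))
attended = cross_attention(text_emb, visual_prompts)
```

In a real model the attention output would be fused back into the text-prompt pathway and trained end-to-end with CLIP's contrastive objective.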
📝 Abstract
We introduce Low-Shot Open-Set Domain Generalization (LSOSDG), a novel paradigm unifying low-shot learning with open-set domain generalization (ODG). While prompt-based methods using models like CLIP have advanced DG, they falter in low-data regimes (e.g., 1-shot) and lack precision in detecting open-set samples with fine-grained semantics related to training classes. To address these challenges, we propose OSLOPROMPT, an advanced prompt-learning framework for CLIP with two core innovations. First, to manage limited supervision across source domains and improve DG, we introduce a domain-agnostic prompt-learning mechanism that integrates adaptable domain-specific cues and visually guided semantic attributes through a novel cross-attention module, further supported by learnable domain- and class-generic visual prompts to enhance cross-modal adaptability. Second, to improve outlier rejection during inference, we classify unfamiliar samples as "unknown" and train specialized prompts with systematically synthesized pseudo-open samples that maintain fine-grained relationships to known classes, generated through a targeted query strategy with off-the-shelf foundation models. This strategy enhances feature learning, enabling our model to detect open samples with varied granularity more effectively. Extensive evaluations across five benchmarks demonstrate that OSLOPROMPT establishes a new state-of-the-art in LSOSDG, significantly outperforming existing methods.
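The "unknown"-class rejection idea can be illustrated as a (K+1)-way cosine classifier: K known class prototypes plus one extra prototype trained on pseudo-open samples, with a test image rejected when the extra slot wins. This is a minimal sketch under assumed names (`classify_with_unknown`, the temperature value), not the paper's actual inference procedure.

```python
import numpy as np

def classify_with_unknown(image_feat, known_protos, unknown_proto, temp=0.07):
    # Stack K known-class prototypes with one "unknown" prompt embedding,
    # score by temperature-scaled cosine similarity, and pick the argmax.
    protos = np.vstack([known_protos, unknown_proto[None, :]])       # (K+1, d)
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    f = image_feat / np.linalg.norm(image_feat)
    logits = protos @ f / temp                                       # (K+1,)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    pred = int(np.argmax(probs))  # pred == K means "reject as unknown"
    return pred, probs

# Toy demo: 3 known classes along the axes, an "unknown" prompt on the diagonal.
known = np.eye(3)
unknown = np.ones(3) / np.sqrt(3)
pred, probs = classify_with_unknown(np.array([1.0, 1.0, 1.0]), known, unknown)
```

Training the unknown prototype on synthesized pseudo-open samples that stay semantically close to the known classes is what lets such a classifier reject fine-grained outliers rather than only obvious ones.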