🤖 AI Summary
To address the insufficient robustness of domain generalization (DG) models on unseen test domains, this paper introduces set-valued prediction, in place of conventional point prediction, to the DG setting. The authors propose a differentiable, end-to-end set-valued learning framework that (i) enforces theoretically grounded cross-domain coverage constraints; (ii) formulates a joint optimization objective balancing set compactness against robustness; and (iii) integrates multi-domain training with modern neural architectures. Evaluated on multiple real-world benchmarks from WILDS, the method significantly improves predictive stability and generalization on unseen domains, empirically supporting the promise of set-valued prediction for DG. Key contributions: (1) the first set-valued prediction paradigm tailored to DG; (2) a theoretical framework supporting statistically guaranteed cross-domain coverage; and (3) a differentiable mechanism that produces compact, robust prediction sets.
📝 Abstract
Despite the impressive advancements in modern machine learning, achieving robustness in Domain Generalization (DG) tasks remains a significant challenge. In DG, models are expected to perform well on samples from unseen test distributions (also called domains) by learning from multiple related training distributions. Most existing approaches to this problem rely on single-valued predictions, which inherently limits their robustness. We argue that set-valued predictors can be leveraged to enhance robustness across unseen domains, while keeping the prediction sets as small as possible. We introduce a theoretical framework defining successful set prediction in the DG setting, focusing on meeting a predefined performance criterion across as many domains as possible, and provide theoretical insights into the conditions under which such domain generalization is achievable. We further propose a practical optimization method, compatible with modern learning architectures, that balances robust performance on unseen domains with small prediction set sizes. We evaluate our approach on several real-world datasets from the WILDS benchmark, demonstrating its potential as a promising direction for robust domain generalization.
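To make the core idea concrete, here is a minimal sketch of a set-valued predictor: include every class whose predicted probability clears a threshold, with the threshold calibrated on held-out data so the true label is covered at a target rate. This is a generic split-calibration illustration, not the paper's actual method (the function names and the softmax-threshold score are our assumptions; the paper instead learns the sets end-to-end with cross-domain coverage constraints).

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Pick a score threshold from a calibration split.

    cal_probs: (n, k) predicted class probabilities.
    cal_labels: (n,) true labels.
    Returns tau such that roughly (1 - alpha) of calibration
    points have their true-class probability >= tau.
    """
    true_scores = cal_probs[np.arange(len(cal_labels)), cal_labels]
    # alpha-quantile: ~(1 - alpha) of true-class scores lie above it.
    return np.quantile(true_scores, alpha)

def predict_sets(probs, tau):
    """Set-valued prediction: all classes whose probability reaches tau."""
    return [set(np.flatnonzero(p >= tau)) for p in probs]
```

The tension the paper targets shows up directly here: lowering `alpha` raises coverage but inflates the sets, and a threshold calibrated on the training domains carries no guarantee on an unseen domain, which is what motivates the cross-domain coverage framework.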