How Many Domains Suffice for Domain Generalization? A Tight Characterization via the Domain Shattering Dimension

📅 2025-06-20
🤖 AI Summary
This paper addresses a fundamental sample complexity question in domain generalization: how many source domains must be randomly drawn from a family of data distributions so that a model learned from them performs reasonably well on every domain in the family, including unseen ones? The authors introduce the domain shattering dimension, a new combinatorial complexity measure, and show that it characterizes the domain sample complexity. They also establish a tight quantitative relationship between this dimension and the classic VC dimension. Working in the PAC learning framework, they show that every hypothesis class learnable in the standard PAC setting is also learnable in the domain generalization setting.

📝 Abstract
We study a fundamental question of domain generalization: given a family of domains (i.e., data distributions), how many randomly sampled domains do we need to collect data from in order to learn a model that performs reasonably well on every seen and unseen domain in the family? We model this problem in the PAC framework and introduce a new combinatorial measure, which we call the domain shattering dimension. We show that this dimension characterizes the domain sample complexity. Furthermore, we establish a tight quantitative relationship between the domain shattering dimension and the classic VC dimension, demonstrating that every hypothesis class that is learnable in the standard PAC setting is also learnable in our setting.
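
For readers less familiar with the classic VC dimension that the abstract compares against, the following is a minimal brute-force sketch in Python. It covers only the standard notion of shattering for a finite hypothesis class over a finite set of points. It does not implement the domain shattering dimension, whose definition is not given in this summary. The names `is_shattered` and `vc_dimension` and the toy threshold and interval classes are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def is_shattered(points, hypotheses):
    """Return True if the hypotheses realize all 2^|points| binary labelings."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)


def vc_dimension(domain, hypotheses):
    """Brute-force VC dimension of a finite class of 0/1 functions on a finite domain."""
    best = 0
    for k in range(1, len(domain) + 1):
        if any(is_shattered(s, hypotheses) for s in combinations(domain, k)):
            best = k
        else:
            # Subsets of shattered sets are shattered, so if no set of
            # size k is shattered, no larger set can be either.
            break
    return best


# Hypothetical toy classes on the points {0, ..., 4} (illustration only).
domain = list(range(5))

# Thresholds: h_t(x) = 1 if x >= t.
thresholds = [lambda x, t=t: int(x >= t) for t in range(6)]

# Closed intervals [a, b], plus the all-zero hypothesis.
intervals = [
    lambda x, a=a, b=b: int(a <= x <= b)
    for a in range(5)
    for b in range(a, 5)
] + [lambda x: 0]

print(vc_dimension(domain, thresholds))
print(vc_dimension(domain, intervals))
```

The search is exponential in the number of points, so it is only practical for very small examples like the ones above.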

Problem

Research questions and friction points this paper is trying to address.

Characterize domain sample complexity for generalization
Introduce domain shattering dimension as new measure
Relate domain shattering dimension to VC dimension

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the domain shattering dimension, a new combinatorial measure
Establishes a tight quantitative relationship between the domain shattering dimension and the VC dimension
Models domain generalization in the PAC framework