Sufficient Invariant Learning for Distribution Shift

📅 2022-10-24
🏛️ Computer Vision and Pattern Recognition
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing invariant feature learning methods for robustness under distribution shifts rely on the strong assumption that invariant features are fully observable in both training and test distributions—a condition frequently violated in practice, leading to poor generalization. Method: We propose Sufficient Invariant Learning (SIL), the first framework to formally define and model *sufficient invariant features*: the minimal subset of invariant features capable of independently supporting robust prediction. We theoretically prove that co-flat minima can accommodate diverse sufficient invariant features. Methodologically, we design Adaptive Sharpness-Aware Group Distributionally Robust Optimization (ASGDRO), which jointly enforces inter-environment flatness alignment and diversity regularization over invariant features. Results: SIL achieves significant improvements over state-of-the-art methods across multiple benchmarks and newly constructed datasets, demonstrating superior robustness against various types of distribution shifts.
📝 Abstract
Learning robust models under distribution shifts between training and test datasets is a fundamental challenge in machine learning. While learning invariant features across environments is a popular approach, it often assumes that these features are fully observed in both training and test sets—a condition frequently violated in practice. When models rely on invariant features absent from the test set, their robustness in new environments can deteriorate. To tackle this problem, we introduce a novel learning principle called the Sufficient Invariant Learning (SIL) framework, which focuses on learning a sufficient subset of invariant features rather than relying on a single feature. After demonstrating the limitations of existing invariant learning methods, we propose a new algorithm, Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO), to learn diverse invariant features by seeking common flat minima across the environments. We theoretically demonstrate that finding common flat minima enables robust predictions based on diverse invariant features. Empirical evaluations on multiple datasets, including our new benchmark, confirm ASGDRO’s robustness against distribution shifts, highlighting the limitations of existing methods. Code: https://github.com/MLAI-Yonsei/SIL-ASGDRO.
Problem

Research questions and friction points this paper is trying to address.

Learning robust models under distribution shift
Addressing incomplete invariant features in test sets
Ensuring model robustness with diverse invariant features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sufficient Invariant Learning framework
Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO)
Seeking common flat minima across environments
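The two components above—worst-group reweighting in the style of group DRO, and a sharpness-aware perturbation that favors minima flat for every environment—can be sketched in a few lines. This is only an illustrative toy, not the paper's implementation: the quadratic per-environment losses, the exponentiated-gradient step size `eta`, and the perturbation radius `rho` are all assumptions made for the example.

```python
import numpy as np

def group_losses(w, centers):
    # Toy per-environment loss: squared distance to each group's own optimum.
    return np.array([np.sum((w - c) ** 2) for c in centers])

def group_grads(w, centers):
    return np.array([2.0 * (w - c) for c in centers])

def asgdro_step(w, q, centers, lr=0.1, rho=0.05, eta=0.5):
    """One sketched update: reweight environments toward the worst group
    (DRO), then evaluate gradients at a SAM-style perturbed point so that
    the update prefers minima that are flat across all environments."""
    losses = group_losses(w, centers)
    # Exponentiated-gradient update on group weights (worst-group emphasis).
    q = q * np.exp(eta * losses)
    q = q / q.sum()
    # Weighted gradient, then an ascent step of radius rho toward the
    # sharpest direction before taking the descent step (SAM-style).
    g = (q[:, None] * group_grads(w, centers)).sum(axis=0)
    w_adv = w + rho * g / (np.linalg.norm(g) + 1e-12)
    g_adv = (q[:, None] * group_grads(w_adv, centers)).sum(axis=0)
    return w - lr * g_adv, q

# Two toy environments whose optima disagree; the update settles near the
# point that balances both groups rather than overfitting to either.
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
w, q = np.zeros(2), np.ones(2) / 2
for _ in range(200):
    w, q = asgdro_step(w, q, centers)
print(np.round(w, 2))
```

In this symmetric toy the group weights stay balanced and `w` converges near the midpoint of the two environment optima; with asymmetric losses the exponentiated weights would shift mass toward the harder environment, which is the DRO part of the sketch.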