Sufficient Invariant Learning for Distribution Shift

📅 2022-10-24
🏛️ Computer Vision and Pattern Recognition
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing invariant feature learning methods for robustness under distribution shifts rely on the strong assumption that invariant features are fully observable in both training and test distributions—a condition frequently violated in practice, leading to poor generalization. Method: We propose Sufficient Invariant Learning (SIL), the first framework to formally define and model *sufficient invariant features*: the minimal subset of invariant features capable of independently supporting robust prediction. We theoretically prove that co-flat minima can accommodate diverse sufficient invariant features. Methodologically, we design Adaptive Sharpness-Aware Group Distributionally Robust Optimization (ASGDRO), which jointly enforces inter-environment flatness alignment and diversity regularization over invariant features. Results: SIL achieves significant improvements over state-of-the-art methods across multiple benchmarks and newly constructed datasets, demonstrating superior robustness against various types of distribution shifts.
📝 Abstract
Learning robust models under distribution shifts between training and test datasets is a fundamental challenge in machine learning. While learning invariant features across environments is a popular approach, it often assumes that these features are fully observed in both training and test sets—a condition frequently violated in practice. When models rely on invariant features absent from the test set, their robustness in new environments can deteriorate. To tackle this problem, we introduce a novel learning principle called the Sufficient Invariant Learning (SIL) framework, which focuses on learning a sufficient subset of invariant features rather than relying on a single feature. After demonstrating the limitations of existing invariant learning methods, we propose a new algorithm, Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO), to learn diverse invariant features by seeking common flat minima across the environments. We theoretically demonstrate that finding common flat minima enables robust predictions based on diverse invariant features. Empirical evaluations on multiple datasets, including our new benchmark, confirm ASGDRO’s robustness against distribution shifts, highlighting the limitations of existing methods. Code: https://github.com/MLAI-Yonsei/SIL-ASGDRO.
Problem

Research questions and friction points this paper is trying to address.

Learning robust models under distribution shift
Addressing incomplete invariant features in test sets
Ensuring model robustness with diverse invariant features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sufficient Invariant Learning framework
Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO)
Seeking common flat minima across environments
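The two components above—worst-group reweighting in the style of group DRO, and a sharpness-aware perturbation that favors minima flat for every environment—can be sketched in a few lines. This is only an illustrative toy, not the paper's implementation: the quadratic per-environment losses, the exponentiated-gradient step size `eta`, and the perturbation radius `rho` are all assumptions made for the example.

```python
import numpy as np

def group_losses(w, centers):
    # Toy per-environment loss: squared distance to each group's own optimum.
    return np.array([np.sum((w - c) ** 2) for c in centers])

def group_grads(w, centers):
    return np.array([2.0 * (w - c) for c in centers])

def asgdro_step(w, q, centers, lr=0.1, rho=0.05, eta=0.5):
    """One sketched update: reweight environments toward the worst group
    (DRO), then evaluate gradients at a SAM-style perturbed point so that
    the update prefers minima that are flat across all environments."""
    losses = group_losses(w, centers)
    # Exponentiated-gradient update on group weights (worst-group emphasis).
    q = q * np.exp(eta * losses)
    q = q / q.sum()
    # Weighted gradient, then an ascent step of radius rho toward the
    # sharpest direction before taking the descent step (SAM-style).
    g = (q[:, None] * group_grads(w, centers)).sum(axis=0)
    w_adv = w + rho * g / (np.linalg.norm(g) + 1e-12)
    g_adv = (q[:, None] * group_grads(w_adv, centers)).sum(axis=0)
    return w - lr * g_adv, q

# Two toy environments whose optima disagree; the update settles near the
# point that balances both groups rather than overfitting to either.
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
w, q = np.zeros(2), np.ones(2) / 2
for _ in range(200):
    w, q = asgdro_step(w, q, centers)
print(np.round(w, 2))
```

In this symmetric toy the group weights stay balanced and `w` converges near the midpoint of the two environment optima; with asymmetric losses the exponentiated weights would shift mass toward the harder environment, which is the DRO part of the sketch.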