🤖 AI Summary
Existing domain generalization methods (e.g., IRMv1, VREx) often fail to converge to the optimal invariant predictor under label noise because their optimization objectives are misaligned with it. This paper establishes, for the first time from a causal perspective, that invariance of the representation–label correlation across environments is a necessary condition for the optimal invariant predictor under label noise. Building on this theoretical insight, we propose a novel learning principle, provably convergent and robust to label noise, that explicitly models invariant correlations within the IRM framework. Our approach integrates covariance-constrained regularization with gradient alignment to enforce environment-invariant predictive relationships. Extensive experiments demonstrate consistent and significant improvements over IRMv1, VREx, and other baselines on multiple noisy domain generalization benchmarks. The theoretical analysis aligns closely with the empirical results, and our code is publicly available.
📝 Abstract
The Invariant Risk Minimization (IRM) approach aims to address the challenge of domain generalization by training a feature representation that remains invariant across multiple environments. However, in noisy environments, IRM-related techniques such as IRMv1 and VREx may fail to reach the optimal IRM solution, primarily because their optimization directions are erroneous. To address this issue, we introduce ICorr (short for Invariant Correlation), a novel approach designed to overcome this challenge in noisy settings. We also present a case study analyzing why previous methods may fail while ICorr can succeed. From a theoretical, causality-based perspective, we show that the invariant correlation of the representation with the label is a necessary condition for the optimal invariant predictor in noisy environments, whereas the conditions that motivate the other methods' objectives may not be. Furthermore, we empirically demonstrate the effectiveness of ICorr by comparing it with other domain generalization methods on various noisy datasets. The code is available at https://github.com/Alexkael/ICorr.
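To make the contrast between penalizing risk variation and penalizing correlation variation concrete, the following minimal sketch computes both quantities on toy data. It is an illustration under stated assumptions, not the ICorr objective from the paper or its repository: the function names, the use of squared error as the risk, the uncentered mean of `output * label` as the "correlation", and the toy environments are all hypothetical. Only the idea of a VREx-style penalty as the variance of per-environment risks is standard.

```python
from statistics import fmean, pvariance


def env_risk(outputs, labels):
    """Mean squared error of the predictor within one environment."""
    return fmean((o - y) ** 2 for o, y in zip(outputs, labels))


def env_correlation(outputs, labels):
    """Mean of output * label within one environment.

    Assumption: an uncentered correlation stands in for the paper's
    representation-label correlation, which the abstract does not define.
    """
    return fmean(o * y for o, y in zip(outputs, labels))


def risk_variance_penalty(envs):
    """VREx-style penalty: variance of per-environment risks."""
    return pvariance([env_risk(o, y) for o, y in envs])


def correlation_variance_penalty(envs):
    """Hypothetical invariant-correlation penalty: variance of
    per-environment output-label correlations."""
    return pvariance([env_correlation(o, y) for o, y in envs])


# Toy data (hypothetical): the same predictor outputs in two environments,
# with one label flipped in the second environment to mimic label noise.
clean_env = ([1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0])
noisy_env = ([1.0, -1.0, 1.0, -1.0], [1.0, -1.0, -1.0, -1.0])
envs = [clean_env, noisy_env]

print("risk variance penalty:", risk_variance_penalty(envs))
print("correlation variance penalty:", correlation_variance_penalty(envs))
```

In a training loop, either penalty would be added to the average risk with a weighting coefficient. The sketch only shows how the two quantities are computed per environment and then compared across environments; it makes no claim about which one behaves better under noise.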