🤖 AI Summary
Identifying invariant features, i.e., features whose predictive relationship to the outcome is stable across multiple environments, is crucial for out-of-distribution generalization and causal mechanism discovery. Method: This paper proposes Bayesian Invariant Prediction (BIP), a probabilistic model that treats the indices of the invariant features as a latent variable and recovers them by posterior inference. Contribution/Results: Theoretically, the authors prove that the BIP posterior is consistent and that greater heterogeneity across environments leads to faster posterior contraction. Methodologically, they design VI-BIP, an efficient variational approximation for settings with many features. Empirically, in simulations and on real data, BIP and VI-BIP are more accurate and scalable than existing methods for invariant prediction.
📝 Abstract
Invariant prediction [Peters et al., 2016] analyzes feature/outcome data from multiple environments to identify invariant features, those with a stable predictive relationship to the outcome. Such features support generalization to new environments and help reveal causal mechanisms. Previous methods have primarily tackled this problem through hypothesis testing or regularized optimization. Here we develop Bayesian Invariant Prediction (BIP), a probabilistic model for invariant prediction. BIP encodes the indices of invariant features as a latent variable and recovers them by posterior inference. Under the assumptions of Peters et al. [2016], the BIP posterior targets the true invariant features. We prove that the posterior is consistent and that greater environment heterogeneity leads to faster posterior contraction. To handle many features, we design an efficient variational approximation called VI-BIP. In simulations and real data, we find that BIP and VI-BIP are more accurate and scalable than existing methods for invariant prediction.
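To make the idea of "a latent variable over feature indices, recovered by posterior inference" concrete, here is a minimal toy sketch. It is not the BIP model or the VI-BIP algorithm from the paper; it is an illustration under stated assumptions: linear-Gaussian conditionals, a uniform prior over subsets, exhaustive enumeration of subsets, and a BIC approximation to the Bayes factor comparing a conditional of the outcome that is shared across environments against environment-specific conditionals. The names `gaussian_loglik`, `subset_posterior`, and `make_env` are hypothetical helpers introduced only for this example.

```python
import itertools
import numpy as np


def gaussian_loglik(y, X):
    """Maximized Gaussian log-likelihood of a least-squares fit of y on X (with intercept)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)


def subset_posterior(envs, n_features):
    """Toy posterior over feature subsets (assumption: uniform prior, BIC scores).

    envs is a list of (X, y) pairs, one per environment. A subset scores highly
    when a single shared regression of y on that subset explains all environments
    about as well as separate per-environment regressions do.
    """
    sizes = [len(y) for _, y in envs]
    y_all = np.concatenate([y for _, y in envs])
    subsets = [s for k in range(n_features + 1)
               for s in itertools.combinations(range(n_features), k)]
    scores = []
    for s in subsets:
        cols = list(s)
        X_all = np.vstack([X[:, cols] for X, _ in envs])
        n_params = len(cols) + 2  # coefficients, intercept, noise variance
        # BIC-style score of the shared (invariant) conditional.
        shared = gaussian_loglik(y_all, X_all) - 0.5 * n_params * np.log(len(y_all))
        # BIC-style score of environment-specific conditionals.
        separate = sum(gaussian_loglik(y, X[:, cols]) - 0.5 * n_params * np.log(n)
                       for (X, y), n in zip(envs, sizes))
        scores.append(shared - separate)  # approximate log Bayes factor
    scores = np.array(scores)
    weights = np.exp(scores - scores.max())  # stable normalization
    return dict(zip(subsets, weights / weights.sum()))


rng = np.random.default_rng(0)


def make_env(n, x1_mean, x2_coef):
    """Simulated environment: x1 causes y (fixed mechanism); x2 is an effect of y."""
    x1 = rng.normal(x1_mean, 1.0, n)
    y = 2.0 * x1 + rng.normal(0.0, 1.0, n)
    x2 = x2_coef * y + rng.normal(0.0, 1.0, n)  # this mechanism changes per environment
    return np.column_stack([x1, x2]), y


envs = [make_env(500, 0.0, 1.0), make_env(500, 2.0, -1.0)]
posterior = subset_posterior(envs, n_features=2)
for subset, prob in posterior.items():
    print(subset, round(float(prob), 4))
```

In this simulation only feature 0 has a stable relationship to the outcome, so the subset `(0,)` should receive nearly all of the posterior mass. Exhaustive enumeration grows as 2^p in the number of features p, which is the kind of cost that motivates an approximation such as the variational approach the abstract describes.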