Bayesian Invariance Modeling of Multi-Environment Data

📅 2025-06-27
🤖 AI Summary
Identifying invariant features, those whose predictive relationship to the outcome is stable across multiple environments, supports out-of-distribution generalization and causal mechanism discovery. Method: This paper proposes Bayesian Invariant Prediction (BIP), a probabilistic model that encodes the indices of the invariant features as a latent variable and recovers them by posterior inference. Contribution/Results: Theoretically, the authors prove that the BIP posterior is consistent and that greater environment heterogeneity leads to faster posterior contraction. Methodologically, they design VI-BIP, a variational approximation that scales to settings with many features. Empirically, in simulations and real data, BIP and VI-BIP are more accurate and scalable than existing methods for invariant prediction.

📝 Abstract
Invariant prediction [Peters et al., 2016] analyzes feature/outcome data from multiple environments to identify invariant features: those with a stable predictive relationship to the outcome. Such features support generalization to new environments and help reveal causal mechanisms. Previous methods have primarily tackled this problem through hypothesis testing or regularized optimization. Here we develop Bayesian Invariant Prediction (BIP), a probabilistic model for invariant prediction. BIP encodes the indices of invariant features as a latent variable and recovers them by posterior inference. Under the assumptions of Peters et al. [2016], the BIP posterior targets the true invariant features. We prove that the posterior is consistent and that greater environment heterogeneity leads to faster posterior contraction. To handle many features, we design an efficient variational approximation called VI-BIP. In simulations and real data, we find that BIP and VI-BIP are more accurate and scalable than existing methods for invariant prediction.
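To make the latent-variable formulation concrete, the setup can be sketched in generic notation. The symbols below (environments e, the invariant set S*, the indicator vector z, the dataset D) are illustrative assumptions and are not taken from the paper. The abstract does not give the specific form of the BIP prior or likelihood, so only the invariance assumption and Bayes' rule are shown.

```latex
% Illustrative notation (an assumption, not the paper's):
% environments e = 1, ..., E; features x \in \mathbb{R}^p; outcome y;
% p^{e} denotes the data distribution in environment e.

% Invariance assumption in the spirit of Peters et al. [2016]:
% some feature subset S^* gives the same conditional in every environment.
\[
  p^{e}\bigl(y \mid x_{S^*}\bigr) \;=\; p^{e'}\bigl(y \mid x_{S^*}\bigr)
  \qquad \text{for all environments } e, e'.
\]

% Encode a candidate subset as a latent indicator z \in \{0,1\}^p,
% where z_j = 1 means feature j is treated as invariant.
% Posterior inference over z then follows from Bayes' rule:
\[
  p(z \mid \mathcal{D}) \;\propto\; p(z)\, p(\mathcal{D} \mid z),
  \qquad
  \mathcal{D} = \bigl\{ (x^{e}_{i}, y^{e}_{i}) \bigr\}_{e,\, i}.
\]
```

In this reading, posterior consistency would mean that the posterior mass on the true indicator for S* approaches one as data accumulate. The number of candidate subsets grows as 2^p, which is the usual motivation for an approximation such as VI-BIP when there are many features.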
Problem

Research questions and friction points this paper is trying to address.

Identify invariant features in multi-environment data
Develop Bayesian model for invariant feature prediction
Improve accuracy and scalability of invariant prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian probabilistic model for invariant prediction
Latent variable encoding for invariant features
Efficient variational approximation VI-BIP
Authors

Luhuan Wu
PhD student, Columbia University
machine learning, statistics

Mingzhang Yin
Assistant Professor, University of Florida
Bayesian Statistics, Machine Learning, Causal Inference, Quantitative Marketing

Yixin Wang
Department of Statistics, University of Michigan

John P. Cunningham
Department of Statistics, Columbia University

David M. Blei
Department of Statistics, Columbia University; Department of Computer Science, Columbia University