🤖 AI Summary
Conventional methods for identifying stable causal variables break down in multi-source heterogeneous environments, where causal mechanisms shift across contexts. To address this, the paper proposes the Focused Adversarial Invariant Regularization (FAIR) framework, which uses a focused minimax optimization to drive regression models toward prediction-invariant solutions through adversarial testing, enabling nonparametric discovery of quasi-causal variables that adapts to low-dimensional structure. The paper formalizes pragmatic causality and establishes its falsifiability, shows that the invariant and quasi-causal variables can be found under a minimal identification condition, and proves that the identified variables align with the exact causal mechanism when environmental heterogeneity is sufficiently rich. The method is instantiated as FAIR-NN, which is trained with a Gumbel-Softmax approximation annealed to low temperature and stochastic gradient descent ascent (SGDA). The procedure is demonstrated on simulated and real-data examples.
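To make the Gumbel-Softmax step more concrete, here is a minimal NumPy sketch of relaxed binary variable-selection gates whose temperature is lowered over time. It is an illustration under stated assumptions, not the FAIR-NN implementation: the function names `relaxed_gates` and `undecided_fraction`, the example logits, and the temperature schedule are all hypothetical.

```python
import numpy as np


def relaxed_gates(logits, temperature, rng):
    """Sample relaxed 0/1 gates (binary Gumbel-Softmax relaxation).

    Each gate is sigmoid((logit + logistic noise) / temperature). As the
    temperature shrinks, samples concentrate near 0 or 1.
    """
    logits = np.asarray(logits, dtype=float)
    u = rng.uniform(1e-9, 1.0 - 1e-9, size=logits.shape)
    # Logistic noise equals the difference of two independent Gumbel draws.
    logistic_noise = np.log(u) - np.log1p(-u)
    z = (logits + logistic_noise) / temperature
    # Numerically stable sigmoid written via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def undecided_fraction(logits, temperature, rng, n_draws=2000):
    """Share of sampled gates that are not yet close to 0 or 1."""
    tiled = np.tile(np.asarray(logits, dtype=float), (n_draws, 1))
    gates = relaxed_gates(tiled, temperature, rng)
    return float(np.mean((gates > 0.05) & (gates < 0.95)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = [2.0, -2.0, 0.0]  # hypothetical selection logits for 3 covariates
    for temperature in (5.0, 1.0, 0.1, 0.01):  # hypothetical annealing schedule
        print(temperature, undecided_fraction(logits, temperature, rng))
```

At high temperature almost every gate is fractional, which keeps gradients informative. At low temperature the gates behave like hard include/exclude decisions, each switched on with probability close to the sigmoid of its logit.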
📝 Abstract
Pursuing causality from data is a fundamental problem in scientific discovery, treatment intervention, and transfer learning. This paper introduces a novel algorithmic method for addressing nonparametric invariance and causality learning in regression models across multiple environments, where the joint distribution of response variables and covariates varies, but the conditional expectations of outcome given an unknown set of quasi-causal variables are invariant. The challenge of finding such an unknown set of quasi-causal or invariant variables is compounded by the presence of endogenous variables that have heterogeneous effects across different environments. The proposed Focused Adversarial Invariant Regularization (FAIR) framework utilizes an innovative minimax optimization approach that drives regression models toward prediction-invariant solutions through adversarial testing. Leveraging the representation power of neural networks, FAIR neural networks (FAIR-NN) are introduced for causality pursuit. It is shown that FAIR-NN can find the invariant variables and quasi-causal variables under a minimal identification condition and that the resulting procedure is adaptive to low-dimensional composition structures in a non-asymptotic analysis. Under a structural causal model, variables identified by FAIR-NN represent pragmatic causality and provably align with exact causal mechanisms under conditions of sufficient heterogeneity. Computationally, FAIR-NN employs a novel Gumbel approximation with decreased temperature and a stochastic gradient descent ascent algorithm. The procedures are demonstrated using simulated and real-data examples.
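The idea of adversarial testing for prediction invariance can be illustrated on a toy linear problem. The sketch below is not the paper's procedure: it assumes a penalty of the form max over f of E[(Y - g(X)) f(X) - f(X)^2 / 2] per environment, restricts both g and f to linear functions of the candidate variables, and evaluates the penalty at a pooled least-squares fit instead of solving the joint minimax problem. The data-generating process and the names `make_env` and `invariance_penalty` are hypothetical.

```python
import numpy as np


def make_env(n, spurious_strength, rng):
    """Simulate one environment: x1 causes y; x2 is an effect of y whose
    strength changes across environments (an endogenous, heterogeneous variable)."""
    x1 = rng.normal(size=n)
    y = 2.0 * x1 + rng.normal(size=n)
    x2 = spurious_strength * y + rng.normal(size=n)
    return np.column_stack([x1, x2]), y


def invariance_penalty(envs, subset):
    """Pooled least squares on `subset`, then a linear adversarial test per environment.

    For residual r and linear test function f on the same variables, the maximum of
    mean(r * f - 0.5 * f**2) is attained when f is the least-squares projection of r.
    """
    X_pool = np.vstack([X[:, subset] for X, _ in envs])
    y_pool = np.concatenate([y for _, y in envs])
    beta, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)

    penalty = 0.0
    for X, y in envs:
        Xs = X[:, subset]
        resid = y - Xs @ beta
        theta, *_ = np.linalg.lstsq(Xs, resid, rcond=None)  # best linear adversary
        f = Xs @ theta
        penalty += np.mean(resid * f - 0.5 * f**2)
    return float(penalty)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    envs = [make_env(5000, 0.5, rng), make_env(5000, 2.0, rng)]
    print("causal only      :", invariance_penalty(envs, [0]))
    print("causal + spurious:", invariance_penalty(envs, [0, 1]))
```

In this toy setup the penalty is close to zero when only the causal covariate is used, because the pooled residual is unpredictable from that covariate in every environment. Adding the heterogeneous covariate lets the per-environment adversary find structure in the residual, so the penalty becomes clearly positive.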