Nonparametric Identification and Inference for Counterfactual Distributions with Confounding

📅 2026-02-17
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work proposes the first framework for identifying counterfactual distributions that integrates causal representation learning with semiparametric theory in the presence of both observed and unobserved confounding. By modeling covariate dependence structures via conditional copulas, the method constructs covariate-augmented sharp bounds and employs a log-sum-exp smooth approximation to enable differentiable optimization. To address unobserved confounding, it combines instrumental variables, triple machine learning, and variational autoencoders, facilitating robust individual-level counterfactual inference. Theoretical analysis establishes the asymptotic distribution of the estimator while accounting for representation learning error and provides sufficient conditions for achieving semiparametric efficiency. Both simulation studies and empirical applications demonstrate the method's validity and superior performance.
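The log-sum-exp smoothing mentioned above can be illustrated with a minimal sketch. It assumes the bounds take the classical Fréchet-Hoeffding form, max(u + v - 1, 0) and min(u, v), which the summary does not spell out. The function names and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math


def smooth_max(values, temperature=50.0):
    """Log-sum-exp approximation to max(values).

    Satisfies max(values) <= smooth_max <= max(values) + log(n) / temperature.
    """
    shift = max(values)  # subtract the maximum for numerical stability
    total = sum(math.exp(temperature * (v - shift)) for v in values)
    return shift + math.log(total) / temperature


def smooth_min(values, temperature=50.0):
    """Smooth minimum, obtained by negating the smooth maximum."""
    return -smooth_max([-v for v in values], temperature)


def frechet_bounds(u, v):
    """Exact (non-differentiable) bounds on a joint CDF with marginal values u, v."""
    return max(u + v - 1.0, 0.0), min(u, v)


def smooth_frechet_bounds(u, v, temperature=50.0):
    """Differentiable surrogate for the bounds above (an assumed form)."""
    lower = smooth_max([u + v - 1.0, 0.0], temperature)
    upper = smooth_min([u, v], temperature)
    return lower, upper


if __name__ == "__main__":
    print(frechet_bounds(0.7, 0.6))
    print(smooth_frechet_bounds(0.7, 0.6, temperature=50.0))
```

The approximation error is at most log(n) / temperature for n arguments, so a larger temperature tracks the exact bounds more closely at the cost of steeper gradients near the kink. The smoothed lower bound sits slightly above the exact one, so it is a surrogate for optimization rather than a conservative bound.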

📝 Abstract
We propose nonparametric identification and semiparametric estimation of joint potential outcome distributions in the presence of confounding. First, in settings with observed confounding, we derive tighter, covariate-informed bounds on the joint distribution by leveraging conditional copulas. To overcome the non-differentiability of the bounding min/max operators, we establish the asymptotic properties of both a direct estimator under a polynomial margin condition and a smooth approximation based on the log-sum-exp operator, facilitating valid inference for individual-level effects under the canonical rank-preserving assumption. Second, we tackle the challenge of unmeasured confounding by introducing a causal representation learning framework. By utilizing instrumental variables, we prove the nonparametric identifiability of the latent confounding subspace under injectivity and completeness conditions. We develop a "triple machine learning" estimator that employs a cross-fitting scheme to sequentially handle the learned representation, nuisance parameters, and target functional. We characterize the asymptotic distribution with variance inflation induced by representation learning error, and provide conditions for semiparametric efficiency. We also propose a practical VAE-based algorithm for confounding representation learning. Simulations and a real-world analysis validate the effectiveness of the proposed methods. By bridging classical semiparametric theory with modern representation learning, this work provides a robust statistical foundation for distributional and counterfactual inference in complex causal systems.
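As a hedged illustration of why covariate-informed bounds can be tighter, the following uses the classical Fréchet-Hoeffding inequalities applied conditionally on covariates. The notation is an assumption made for this sketch and may differ from the paper's: F_{d|X}(y | x) denotes the conditional distribution function of the potential outcome Y(d) given X = x, and F_d(y) its average over X.

```latex
% Conditional Fréchet-Hoeffding bounds on the joint distribution of (Y(1), Y(0)) given X = x
\[
\max\bigl\{F_{1\mid X}(y_1 \mid x) + F_{0\mid X}(y_0 \mid x) - 1,\; 0\bigr\}
\;\le\;
F_{Y(1),Y(0)\mid X}(y_1, y_0 \mid x)
\;\le\;
\min\bigl\{F_{1\mid X}(y_1 \mid x),\; F_{0\mid X}(y_0 \mid x)\bigr\}
\]

% Averaging over X and applying Jensen's inequality (min is concave, max is convex)
\[
\mathbb{E}_X\bigl[\min\{F_{1\mid X}(y_1 \mid X),\, F_{0\mid X}(y_0 \mid X)\}\bigr]
\;\le\;
\min\{F_1(y_1),\, F_0(y_0)\}
\]
\[
\mathbb{E}_X\bigl[\max\{F_{1\mid X}(y_1 \mid X) + F_{0\mid X}(y_0 \mid X) - 1,\, 0\}\bigr]
\;\ge\;
\max\{F_1(y_1) + F_0(y_0) - 1,\, 0\}
\]
```

Under these assumptions, the averaged conditional bounds are never wider than the unconditional ones, and they are strictly narrower whenever the covariates are informative about the margins. The min and max inside the expectations are the non-differentiable operators that the abstract addresses with the margin condition and the log-sum-exp approximation.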
Problem

Research questions and friction points this paper is trying to address.

counterfactual distributions
confounding
nonparametric identification
causal inference
instrumental variables
Innovation

Methods, ideas, or system contributions that make the work stand out.

nonparametric identification
counterfactual distributions
causal representation learning
triple machine learning
conditional copulas