🤖 AI Summary
Unobserved confounding in observational data biases causal effect estimation. Method: This paper proposes a framework that integrates latent variable modeling with double machine learning (DML). It incorporates a latent variable model into the second stage of DML, decoupling representation learning from latent inference and separately modeling the heterogeneous effects of latent variables on treatment and outcome. Contribution/Results: To the authors' knowledge, this is the first work to systematically embed structured latent variable modeling into the DML pipeline. Extensive experiments on synthetic data and multiple real-world benchmark datasets show that the proposed method outperforms state-of-the-art causal inference approaches, achieving superior robustness and estimation accuracy under unmeasured confounding.
📝 Abstract
Latent variable models provide a powerful framework for incorporating and inferring unobserved factors in observational data. In causal inference, they help account for hidden factors influencing treatment or outcome, thereby addressing challenges posed by missing or unmeasured covariates. This paper proposes a new framework that integrates latent variable modeling into the double machine learning (DML) paradigm to enable robust causal effect estimation in the presence of such hidden factors. We consider two scenarios: one where a latent variable affects only the outcome, and another where it may influence both treatment and outcome. To ensure tractability, we incorporate latent variables only in the second stage of DML, separating representation learning from latent inference. We demonstrate the robustness and effectiveness of our method through extensive experiments on both synthetic and real-world datasets.
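For orientation, the standard two-stage DML procedure that the paper builds on can be sketched as follows. This is a minimal illustration of DML for a partially linear model (first stage: fit nuisance models for treatment and outcome with cross-fitting; second stage: regress residuals on residuals), not the paper's latent-variable extension, whose second stage is not specified here. The linear nuisance learners and the simulated data are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated data (assumed for illustration): Y = theta*T + g(X) + noise,
# T = m(X) + noise, with theta = 2.0, g(X) = 1.5*X, m(X) = 0.5*X.
x = rng.normal(size=n)
t = 0.5 * x + rng.normal(size=n)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)

def fit_predict(x_tr, z_tr, x_te):
    # Nuisance learner: ordinary least squares z ~ [1, x].
    # In practice any flexible ML regressor can be plugged in here.
    A = np.column_stack([np.ones_like(x_tr), x_tr])
    coef, *_ = np.linalg.lstsq(A, z_tr, rcond=None)
    return coef[0] + coef[1] * x_te

# First stage with 2-fold cross-fitting: predict T and Y from X on
# held-out folds, then form the residuals.
idx = rng.permutation(n)
folds = [idx[: n // 2], idx[n // 2:]]
t_res = np.empty(n)
y_res = np.empty(n)
for k in (0, 1):
    te, tr = folds[k], folds[1 - k]
    t_res[te] = t[te] - fit_predict(x[tr], t[tr], x[te])
    y_res[te] = y[te] - fit_predict(x[tr], y[tr], x[te])

# Second stage: residual-on-residual regression gives the effect estimate.
# The paper's method modifies this stage to also infer latent variables.
theta_hat = (t_res @ y_res) / (t_res @ t_res)
print(theta_hat)  # close to the true effect 2.0
```

The second stage above is exactly where, per the abstract, the proposed framework inserts the latent variable model, keeping the first-stage representation learning unchanged.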