Average Causal Effect Estimation in DAGs with Hidden Variables: Extensions of Back-Door and Front-Door Criteria

📅 2024-09-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Estimating the average causal effect (ACE) in directed acyclic graphs (DAGs) with hidden variables remains challenging, particularly for continuous variables. Existing semiparametric estimators require density estimation and numerical integration, can produce estimates outside the parameter space of the target estimand, and have underexplored asymptotic properties when nuisance functions are fit with flexible machine learning models. Method: We propose a one-step corrected plug-in estimator and a targeted minimum loss-based estimator (TMLE) for a class of DAGs that extends the back-door and front-door criteria (the treatment primal fixability criterion). Our framework integrates nonparametric/semiparametric modeling, stable density ratio estimation, and boundary-preserving optimization. Contribution/Results: We establish the first machine learning-compatible framework for this class of DAGs that achieves double robustness, √n-consistency, semiparametric efficiency, and estimates that stay within the bounds of the parameter space. We characterize the L₂(P)-convergence rates that nuisance function estimators must attain, and substantially improve estimation accuracy and the reliability of statistical inference. An open-source R package, flexCausal, implements automated identification and estimation.

📝 Abstract

The identification theory for causal effects in directed acyclic graphs (DAGs) with hidden variables is well-developed, but methods for estimating and inferring functionals beyond the g-formula remain limited. Previous studies have proposed semiparametric estimators for identifiable functionals in a broad class of DAGs with hidden variables. While demonstrating double robustness in some models, existing estimators face challenges, particularly with density estimation and numerical integration for continuous variables, and their estimates may fall outside the parameter space of the target estimand. Their asymptotic properties are also underexplored, especially when using flexible statistical and machine learning models for nuisance estimation. This study addresses these challenges by introducing novel one-step corrected plug-in and targeted minimum loss-based estimators of causal effects for a class of DAGs that extend classical back-door and front-door criteria (known as the treatment primal fixability criterion in prior literature). These estimators leverage machine learning to minimize modeling assumptions while ensuring key statistical properties such as asymptotic linearity, double robustness, efficiency, and staying within the bounds of the target parameter space. We establish conditions for nuisance functional estimates in terms of L2(P)-norms to achieve root-n consistent causal effect estimates. To facilitate practical application, we have developed the flexCausal package in R.
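
To make the idea of a one-step corrected plug-in estimator concrete, here is a minimal Python sketch for the simplest special case only: the classical back-door setting with a binary treatment and a single observed binary confounder. It is not the authors' implementation and does not reflect the flexCausal API; the simulated data, the function names (simulate, one_step_ace), and the deliberately misspecified outcome model are all assumptions made for illustration. The sketch shows the double robustness property mentioned above: the outcome model ignores the confounder, yet the correction term, built from a correctly estimated propensity score, recovers the true effect.

```python
import random


def simulate(n, seed=0):
    """Simulate (x, a, y) triples with a binary confounder x (hypothetical data).

    The true average causal effect of a on y is 1.0 by construction.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p_treat = 0.8 if x == 1 else 0.2          # treatment depends on x
        a = 1 if rng.random() < p_treat else 0
        y = 1.0 * a + 2.0 * x + rng.gauss(0.0, 1.0)
        data.append((x, a, y))
    return data


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def one_step_ace(data, outcome_model, propensity):
    """One-step corrected plug-in estimate of E[Y(1)] - E[Y(0)] (back-door case).

    outcome_model(a, x) estimates E[Y | A=a, X=x];
    propensity(x) estimates P(A=1 | X=x).
    """
    total = 0.0
    for x, a, y in data:
        mu1, mu0 = outcome_model(1, x), outcome_model(0, x)
        pi = propensity(x)
        plug_in = mu1 - mu0
        # Estimated influence-function correction: inverse-probability-weighted residuals.
        correction = a / pi * (y - mu1) - (1 - a) / (1 - pi) * (y - mu0)
        total += plug_in + correction
    return total / len(data)


data = simulate(20000)

# Propensity score estimated correctly by stratum frequencies.
pi_hat = {x: mean(a for xx, a, _ in data if xx == x) for x in (0, 1)}

# Outcome model deliberately misspecified: it ignores the confounder x.
mu_marginal = {a: mean(y for _, aa, y in data if aa == a) for a in (0, 1)}

naive = mu_marginal[1] - mu_marginal[0]  # confounded plug-in estimate
corrected = one_step_ace(
    data,
    outcome_model=lambda a, x: mu_marginal[a],
    propensity=lambda x: pi_hat[x],
)

print(f"naive plug-in: {naive:.3f}")
print(f"one-step corrected: {corrected:.3f}")
```

In this simulation the naive plug-in estimate is biased by confounding, while the corrected estimate should land close to the true effect of 1.0. The estimators described in the abstract cover a much broader class of hidden-variable DAGs and add safeguards (such as respecting parameter-space bounds) that this sketch does not attempt.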

Problem

Research questions and friction points this paper is trying to address.

Estimating causal effects in DAGs with hidden variables

Overcoming computational challenges in semiparametric causal estimation

Ensuring statistical properties with machine learning integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step corrected plug-in estimators for causal effects

Targeted minimum loss-based estimators with machine learning

Guaranteed double robustness, efficiency, and asymptotic linearity
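
For orientation, the one-step construction can be written out in the classical back-door special case. The notation below (X for observed covariates, A for treatment, Y for outcome, μ for the outcome regression, π for the propensity score) is assumed here for illustration and is not taken from the paper, whose estimators target a broader class of graphs.

```latex
% Target parameter and nuisance functions (back-door special case)
\psi(a) = \mathbb{E}\big[\mu(a, X)\big], \qquad
\mu(a, x) = \mathbb{E}[Y \mid A = a, X = x], \qquad
\pi(a \mid x) = P(A = a \mid X = x)

% One-step corrected plug-in estimator: plug-in term plus the
% empirical mean of the estimated influence-function correction
\hat{\psi}^{+}(a) = \frac{1}{n} \sum_{i=1}^{n}
  \left\{ \hat{\mu}(a, X_i)
  + \frac{\mathbb{I}(A_i = a)}{\hat{\pi}(a \mid X_i)}
    \big( Y_i - \hat{\mu}(a, X_i) \big) \right\}

% A typical sufficient rate condition for root-n consistency
% in this special case (product of L_2(P) errors)
\big\lVert \hat{\pi} - \pi \big\rVert_{L_2(P)} \,
\big\lVert \hat{\mu} - \mu \big\rVert_{L_2(P)} = o_P\big(n^{-1/2}\big)
```

The product form of the rate condition is what makes flexible machine learning fits usable in this special case: neither nuisance estimator needs to converge at the parametric rate on its own, and the estimator stays consistent if either one is consistent (double robustness).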

🔎 Similar Papers

Toward identifiability of total effects in summary causal graphs with latent confounders: an extension of the front-door criterion