🤖 AI Summary
Unmeasured confounding can invalidate conventional adjustment strategies such as back-door adjustment. This paper studies the "Napkin graph", a single causal structure that encapsulates patterns of M-bias, instrumental variables, and the classical back-door and front-door models, yet requires a nonstandard identification strategy: the average treatment effect (ATE) is expressed as a ratio of two g-formulas. We develop doubly robust one-step and targeted minimum loss-based (TMLE) estimators for this functional, which remain asymptotically linear when machine learning nuisance estimators converge at slower-than-parametric rates. We also show how the Verma constraint encoded by the Napkin graph can be exploited to improve semiparametric efficiency. The methods are validated in simulations and applied to the Finnish Life Course study to estimate the effect of educational attainment on income. An accompanying R package, *napkincausal*, implements all proposed procedures.
📝 Abstract
Unmeasured confounding can render identification strategies based on adjustment functionals invalid. We study the "Napkin graph", a causal structure that encapsulates patterns of M-bias, instrumental variables, and the classical back-door and front-door models within a single graphical framework, yet requires a nonstandard identification strategy: the average treatment effect is expressed as a ratio of two g-formulas. We develop novel estimators for this functional, including doubly robust one-step and targeted minimum loss-based estimators that remain asymptotically linear when nuisance functions are estimated at slower-than-parametric rates using machine learning. We also show how a generalized independence restriction encoded by the Napkin graph, known as a Verma constraint, can be exploited to improve efficiency, illustrating more generally how such constraints in hidden variable DAGs can inform semiparametric inference. The proposed methods are validated through simulations and applied to the Finnish Life Course study to estimate the effect of educational attainment on income. An accompanying R package, napkincausal, implements all proposed procedures.
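To make the "ratio of two g-formulas" concrete, here is a minimal Python sketch of a plug-in version of that functional on simulated binary data. Everything in it is an illustrative assumption rather than material from the paper: the variable names (W, Z, X, Y for the observed nodes, U1 and U2 for the hidden ones), the edge layout W → Z → X → Y with U1 affecting W and X and U2 affecting W and Y, the simulation coefficients, and the helper names `simulate_napkin`, `napkin_mean`, and `napkin_ate`. It is a simple plug-in estimator, not the one-step or TMLE estimators described above, and it does not reflect the napkincausal package's interface.

```python
import numpy as np


def simulate_napkin(n, rng):
    """Draw binary data from an assumed napkin-style model (illustrative only)."""
    u1 = rng.binomial(1, 0.5, n)                        # hidden: affects W and X
    u2 = rng.binomial(1, 0.5, n)                        # hidden: affects W and Y
    w = rng.binomial(1, 0.2 + 0.3 * u1 + 0.3 * u2)      # W depends on U1, U2
    z = rng.binomial(1, 0.3 + 0.4 * w)                  # Z depends on W only
    x = rng.binomial(1, 0.2 + 0.3 * z + 0.4 * u1)       # treatment X depends on Z, U1
    y = rng.binomial(1, 0.1 + 0.3 * x + 0.5 * u2)       # outcome Y; true ATE is 0.3
    return w, z, x, y


def napkin_mean(w, z, x, y, x_val, z_val):
    """Plug-in estimate of E[Y(x_val)] as a ratio of two g-formulas.

    numerator   = sum_w P(Y=1, X=x_val | Z=z_val, W=w) * P(W=w)
    denominator = sum_w P(X=x_val | Z=z_val, W=w) * P(W=w)
    """
    num = 0.0
    den = 0.0
    for w_val in (0, 1):
        p_w = np.mean(w == w_val)                       # marginal P(W=w)
        stratum = (w == w_val) & (z == z_val)           # condition on Z=z_val, W=w
        is_x = x[stratum] == x_val
        num += p_w * np.mean(is_x * y[stratum])
        den += p_w * np.mean(is_x)
    return num / den


def napkin_ate(w, z, x, y, z_val=1):
    """Difference of the two counterfactual means at a fixed level of Z."""
    return (napkin_mean(w, z, x, y, 1, z_val)
            - napkin_mean(w, z, x, y, 0, z_val))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w, z, x, y = simulate_napkin(200_000, rng)
    print("plug-in ATE estimate:", napkin_ate(w, z, x, y, z_val=1))
    print("true ATE by construction: 0.3")
```

In this assumed model the ratio does not depend on which level of Z is fixed, so calling `napkin_ate` with `z_val=0` and `z_val=1` should give similar values in large samples.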