Targeted Deep Architectures: A TMLE-Based Framework for Robust Causal Inference in Neural Networks

📅 2025-07-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep neural networks excel at prediction but face substantial bias and high computational cost when used for unbiased, efficient inference of causal parameters such as average treatment effects or survival curves. To address this, we propose Targeted Deep Architectures (TDA), the first framework to directly embed Targeted Maximum Likelihood Estimation (TMLE) into neural network parameter updates. TDA employs a parameter-splitting mechanism: it freezes the backbone weights while iteratively updating only a small "targeting" subparameter along the doubly robust influence function direction. The method is architecture-agnostic and unifies debiased estimation with asymptotically efficient inference for multivariate causal parameters. Evaluated on the Infant Health and Development Program (IHDP) benchmark and on survival data with informative censoring, TDA significantly reduces estimation bias and improves confidence interval coverage. It achieves theoretical rigor, guaranteeing asymptotic normality and semiparametric efficiency, while maintaining computational scalability.

📝 Abstract
Modern deep neural networks are powerful predictive tools yet often lack valid inference for causal parameters, such as treatment effects or entire survival curves. While frameworks like Double Machine Learning (DML) and Targeted Maximum Likelihood Estimation (TMLE) can debias machine-learning fits, existing neural implementations either rely on "targeted losses" that do not guarantee solving the efficient influence function equation or on computationally expensive post-hoc "fluctuations" for multi-parameter settings. We propose Targeted Deep Architectures (TDA), a new framework that embeds TMLE directly into the network's parameter space with no restrictions on the backbone architecture. Specifically, TDA partitions model parameters, freezing all but a small "targeting" subset, and iteratively updates that subset along a targeting gradient, derived from projecting the influence functions onto the span of the gradients of the loss with respect to weights. This procedure yields plug-in estimates that remove first-order bias and produce asymptotically valid confidence intervals. Crucially, TDA easily extends to multi-dimensional causal estimands (e.g., entire survival curves) by merging separate targeting gradients into a single universal targeting update. Theoretically, TDA inherits classical TMLE properties, including double robustness and semiparametric efficiency. Empirically, on the benchmark IHDP dataset (average treatment effects) and simulated survival data with informative censoring, TDA reduces bias and improves coverage relative to both standard neural-network estimators and prior post-hoc approaches. In doing so, TDA establishes a direct, scalable pathway toward rigorous causal inference within modern deep architectures for complex multi-parameter targets.
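The targeting idea the abstract describes, freezing the initial fits and nudging only a small fluctuation parameter along the efficient influence function direction, can be illustrated with a classical gradient-based TMLE targeting step for the average treatment effect. This is a hedged, simplified sketch (a scalar fluctuation on top of fixed nuisance estimates), not the paper's actual network-embedded update; all function and variable names here are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def targeting_step(Q1, Q0, g, A, Y, lr=0.1, n_iter=200):
    """Gradient-based TMLE targeting step for the ATE (illustrative).

    Q1, Q0 : initial outcome predictions E[Y|A=1,W], E[Y|A=0,W], in (0, 1)
    g      : propensity scores P(A=1|W), bounded away from 0 and 1
    A, Y   : binary treatment and binary outcome arrays

    The "backbone" fits (Q1, Q0, g) stay frozen; only a scalar
    fluctuation eps is updated along the efficient-influence-function
    direction (the "clever covariate" H).
    """
    # clever covariate for the ATE efficient influence function
    H = A / g - (1 - A) / (1 - g)
    QA = np.where(A == 1, Q1, Q0)
    logit_QA = np.log(QA / (1 - QA))
    eps = 0.0
    for _ in range(n_iter):
        QA_eps = sigmoid(logit_QA + eps * H)
        # ascent direction: empirical EIF score in eps
        score = np.mean(H * (Y - QA_eps))
        eps += lr * score
    # targeted plug-in predictions and ATE estimate
    Q1_star = sigmoid(np.log(Q1 / (1 - Q1)) + eps / g)
    Q0_star = sigmoid(np.log(Q0 / (1 - Q0)) - eps / (1 - g))
    return np.mean(Q1_star - Q0_star), eps
```

At convergence the empirical mean of the influence-function score is (approximately) zero, which is the condition that the "targeted losses" criticized in the abstract do not guarantee. TDA generalizes this scalar fluctuation to a small subset of network weights.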
Problem

Research questions and friction points this paper is trying to address.

How to obtain valid causal inference from deep neural networks
How to efficiently reduce bias for multi-parameter causal estimands
How to construct asymptotically valid confidence intervals for complex targets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Embeds TMLE directly into neural network architecture
Uses targeting gradient for bias reduction
Extends to multi-dimensional causal estimands
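The multi-dimensional extension, merging separate targeting gradients into one universal update, can be sketched as a least-squares projection. This is an assumption-laden illustration of the general idea, not the paper's exact construction: each estimand's per-sample influence function is projected onto the span of per-sample loss gradients, and the per-estimand directions are combined, weighted by each estimand's remaining empirical bias.

```python
import numpy as np

def universal_targeting_direction(S, D):
    """Merge K targeting gradients into one update direction (illustrative).

    S : (n, p) per-sample gradients of the loss w.r.t. the p targeting
        sub-parameters (the span onto which influence functions project)
    D : (n, K) per-sample efficient influence function values for K
        estimands (e.g., a survival curve evaluated at K time points)

    Returns a unit p-vector combining the K projected directions,
    weighted by the current empirical mean of each influence function.
    """
    # least-squares projection: coefficients B with S @ B ~ D
    B, *_ = np.linalg.lstsq(S, D, rcond=None)   # shape (p, K)
    bias = D.mean(axis=0)                        # shape (K,): residual bias
    direction = B @ bias                         # single merged p-vector
    norm = np.linalg.norm(direction)
    return direction / norm if norm > 0 else direction
```

A single direction of this kind lets one gradient step reduce first-order bias across all K estimands simultaneously, rather than running K separate post-hoc fluctuations.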