🤖 AI Summary
In causal inference, machine learning is widely used to estimate propensity scores and outcome models, yet practical guidance for constructing targeted minimum loss estimators (TMLEs) that balance theoretical rigor with implementability remains scarce for applied researchers.
Method: Building on the efficient influence function framework, we propose a modular, reproducible TMLE construction procedure: first fit auxiliary models using arbitrary machine learning methods; then perform a one-dimensional targeted update to correct bias.
Contribution/Results: Our key contribution is translating abstract efficiency theory into three intuitive steps (initial estimation, efficient influence function computation, and parameter update), lowering the conceptual and implementation barriers to TMLE. The estimator is doubly robust: it remains consistent if either the outcome model or the propensity score model is consistently estimated, and it is asymptotically efficient when both are. By narrowing the gap between statistical theory and empirical practice, our approach helps applied researchers conduct machine learning–enhanced causal analyses with valid inference.
📝 Abstract
Use of machine learning to estimate nuisance functions (e.g., outcome models, propensity score models) in estimators used in causal inference is increasingly common, as it can mitigate bias due to model misspecification. However, it can be challenging to achieve valid inference (e.g., estimate valid confidence intervals). The efficient influence function (EIF) provides a recipe to go from a statistical estimand relevant to our causal question to an estimator that can validly incorporate machine learning. Our companion paper, Renson et al. 2025 (arXiv:2502.05363), provides a thorough but approachable description of the EIF, along with a guide through the steps to go from a unique statistical estimand to development of one type of EIF-based estimator, the so-called one-step estimator. Another commonly used estimator based on the EIF is the targeted maximum likelihood/minimum loss estimator (TMLE). Construction of TMLEs is well discussed in the statistical literature, but there remains a gap in translation to a more applied audience. In this letter, which supplements Renson et al., we provide a more accessible illustration of how to construct a TMLE.
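To make the construction concrete, here is a minimal sketch of a TMLE for one common estimand, the average treatment effect (ATE) with a binary treatment and a binary outcome. It is an illustration under stated assumptions, not the procedure from the letter: the simulated data, the function names (`simulate`, `tmle_ate`), the logistic fluctuation model, and the deliberately crude initial outcome model (which ignores the confounder) are all choices made for this example. In practice the initial fits would come from machine learning, typically with cross-fitting. The simulated truth is an ATE of 0.3 by construction.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def simulate(n, seed=0):
    """Hypothetical data: binary W confounds binary A and binary Y (true ATE = 0.3)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = int(rng.random() < 0.5)
        a = int(rng.random() < (0.7 if w else 0.3))
        y = int(rng.random() < 0.2 + 0.3 * a + 0.3 * w)
        rows.append((w, a, y))
    return rows


def tmle_ate(rows):
    n = len(rows)
    ys = [y for _, _, y in rows]

    # Step 1: initial nuisance estimates (stand-ins for machine learning fits).
    # Outcome model: mean of Y by treatment arm only; it ignores W on purpose.
    q_init = {arm: mean(y for _, a, y in rows if a == arm) for arm in (0, 1)}
    # Propensity score: P(A = 1 | W), estimated within each stratum of W.
    g_hat = {s: mean(a for w, a, _ in rows if w == s) for s in (0, 1)}

    # Step 2: "clever covariate" taken from the EIF of the ATE.
    h1 = [1.0 / g_hat[w] for w, _, _ in rows]            # value if A were 1
    h0 = [-1.0 / (1.0 - g_hat[w]) for w, _, _ in rows]   # value if A were 0
    h_obs = [h1[i] if a == 1 else h0[i] for i, (_, a, _) in enumerate(rows)]
    offset = [logit(q_init[a]) for _, a, _ in rows]

    # Step 3: one-dimensional targeting update. Fit epsilon in the model
    # logit Q*(A, W) = logit Q(A, W) + epsilon * H(A, W) by Newton's method.
    eps = 0.0
    for _ in range(50):
        p = [expit(o + eps * h) for o, h in zip(offset, h_obs)]
        score = sum(h * (y - pi) for h, y, pi in zip(h_obs, ys, p))
        info = sum(h * h * pi * (1.0 - pi) for h, pi in zip(h_obs, p))
        step = score / info
        eps += step
        if abs(step) < 1e-12:
            break

    # Updated predictions under each treatment level, then the plug-in estimate.
    q1 = [expit(logit(q_init[1]) + eps * h) for h in h1]
    q0 = [expit(logit(q_init[0]) + eps * h) for h in h0]
    ate = mean(u - v for u, v in zip(q1, q0))

    # Estimated EIF at the updated fit; its sample variance gives the standard error.
    q_obs = [q1[i] if a == 1 else q0[i] for i, (_, a, _) in enumerate(rows)]
    eif = [h * (y - q) + u - v - ate
           for h, y, q, u, v in zip(h_obs, ys, q_obs, q1, q0)]
    se = math.sqrt(mean(d * d for d in eif) / n)
    return {
        "naive": q_init[1] - q_init[0],   # plug-in from the initial outcome model
        "ate": ate,
        "se": se,
        "ci": (ate - 1.96 * se, ate + 1.96 * se),
        "epsilon": eps,
        "eif": eif,
    }


if __name__ == "__main__":
    fit = tmle_ate(simulate(5000, seed=0))
    print({k: v for k, v in fit.items() if k != "eif"})
```

Because the fluctuation step solves the score equation for `epsilon`, the sample mean of the estimated EIF is numerically zero after the update. That property is what lets the plug-in estimate inherit the EIF-based variance estimate. Comparing `naive` with `ate` shows how the update pulls a confounded initial estimate toward the truth when the propensity score is estimated well.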