AI Summary
This paper addresses the bias that confounding introduces into direct and indirect effect estimates in causal mediation analysis. It systematically evaluates estimators for univariate and multivariate mediators, both binary and continuous. It provides the first unified benchmark of state-of-the-art estimators, including multiply robust estimators and double machine learning, in multivariate mediation settings, and proposes a practical guideline covering validation of the identification assumptions, estimator selection, and implementation. In extensive simulation studies and an empirical analysis of the UK Biobank brain imaging cohort, these estimators show substantial improvements over conventional parametric and nonparametric mediation models across diverse scenarios. The empirical findings indicate that hypertension and obesity exert significant indirect effects on cognitive function, predominantly via structural brain changes (particularly reduced gray matter volume), highlighting the role of neuroanatomical pathways in cardiometabolic-cognitive associations.
Abstract
Mediation analysis breaks down the causal effect of a treatment on an outcome into an indirect effect, acting through a third group of variables called mediators, and a direct effect, operating through other mechanisms. Mediation analysis is hard because, in observational studies, confounders of the treatment, mediators, and outcome bias the effect estimates. Many estimators have been proposed to adjust for those confounders and provide accurate causal estimates. We consider parametric and non-parametric implementations of classical estimators and thoroughly evaluate how well they estimate the direct and indirect effects for binary, continuous, and multi-dimensional mediators. We assess several approaches in a comprehensive benchmark on simulated data. Our results show that advanced statistical approaches such as the multiply robust and the double machine learning estimators perform well in most of the simulated settings and on real data. As an example application, we analyze factors known to influence cognitive functions, using the UK Biobank brain imaging cohort, to assess whether the mechanism involves modifications in brain morphology. This analysis shows that for several physiological factors, such as hypertension and obesity, a substantial part of the effect is mediated by changes in brain structure. This work guides the practitioner from the formulation of a valid causal mediation problem, including the verification of the identification assumptions, to the choice of an adequate estimator.
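The decomposition into direct and indirect effects, and the role of confounder adjustment, can be illustrated with a minimal sketch. It is not one of the estimators benchmarked in the paper: it assumes a fully linear data-generating process with a single observed confounder and uses the classical product-of-coefficients approach with ordinary least squares. The helper `ols`, the variable names, and all coefficient values are hypothetical choices made for illustration.

```python
import numpy as np

def ols(columns, y):
    """Least-squares coefficients for y ~ 1 + columns (intercept first)."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Simulated data (all effect sizes are illustrative assumptions).
rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=n)                              # observed confounder
t = rng.binomial(1, 1 / (1 + np.exp(-x)))           # treatment depends on x
m = 0.8 * t + 1.0 * x + rng.normal(size=n)          # mediator: T -> M effect is 0.8
y = 0.5 * t + 0.6 * m + 1.0 * x + rng.normal(size=n)  # direct 0.5, M -> Y effect 0.6

# Mediator model M ~ T + X gives the treatment-to-mediator coefficient.
a_hat = ols([t, x], m)[1]

# Outcome model Y ~ T + M + X gives the direct effect and the mediator coefficient.
coef_y = ols([t, m, x], y)
direct_hat = coef_y[1]
b_hat = coef_y[2]

# Under the linear model, the indirect effect is the product of coefficients.
indirect_hat = a_hat * b_hat

# Total effect, adjusted for the confounder, and a naive unadjusted version.
total_hat = ols([t, x], y)[1]
naive_total = ols([t], y)[1]

print("direct:", direct_hat)
print("indirect:", indirect_hat)
print("total (adjusted):", total_hat)
print("total (unadjusted):", naive_total)
```

In this linear setting the adjusted total effect equals the sum of the direct and indirect estimates, while the unadjusted regression overstates the effect because the confounder drives both treatment and outcome. The multiply robust and double machine learning estimators discussed in the abstract target the same quantities without relying on these linearity assumptions.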