Improving the Efficiency of Subgroup Analysis in Randomized Controlled Trials with TMLE

📅 2026-05-14

🤖 AI Summary
This study addresses the low statistical power of treatment effect estimates for small subgroups in randomized controlled trials, a consequence of limited subgroup sample sizes. The authors study two data-adaptive targeted maximum likelihood estimation (TMLE) approaches, TMLE-PR and A-TMLE, that borrow information from participants outside the target subgroup within the same trial to improve precision in the target subgroup, without requiring external data. Both share information across subgroups while retaining the protection against bias afforded by randomization. Applied to the LEADER cardiovascular outcome trial, A-TMLE estimated absolute reductions in the risk of major adverse cardiac events (MACE) of 1.5 to 1.6 percentage points among Asian participants and 2.0 to 2.1 percentage points among Black participants at 365, 540, and 730 days, with 95% confidence intervals excluding the null at each time point. Each of these subgroups comprises less than 10% of the trial population. The approach is intended to support rigorous subgroup inference for equitable labeling, access, and post-market evaluation.
📝 Abstract
Subgroup analyses within randomized controlled trials are often underpowered due to limited sample sizes. We address this challenge by leveraging trial participants outside the subgroup of interest to augment estimation within the subgroup. Specifically, we study two Targeted Maximum Likelihood Estimators (TMLEs) that borrow information from non-subgroup participants within the same trial: a TMLE with pooled regression (TMLE-PR) and an Adaptive Targeted Maximum Likelihood Estimator (A-TMLE). Both estimators enable information sharing without relying on any external real-world data, thereby capitalizing on key strengths of the trial: most importantly, the protection against bias afforded by the randomized treatment, but also harmonized data collection, and consistent treatment and outcome definitions. The general strategy proposed here directly advances the priorities of key regulatory agencies, including the FDA, by improving the precision of subgroup-specific treatment effect estimates without introducing external sources of bias, thereby facilitating rigorous inference to support equitable labeling, access, and post-market evaluation. In a case study based on analysis of data from a cardiovascular outcome trial (LEADER, NCT01179048), we estimate the risk reduction of major adverse cardiac events (MACE) under liraglutide treatment among Black and Asian subgroups (each comprising less than 10% of the trial population) using the proposed estimators that borrow information from the remainder of the trial. Using A-TMLE, in particular, we find estimated absolute MACE risk reductions of 1.6, 1.5, and 1.5 percentage points among Asian participants and 2.1, 2.0, and 2.1 percentage points among Black participants at 365, 540, and 730 days, respectively, with 95% confidence intervals excluding the null at each time point.
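To make the idea of borrowing within-trial information concrete, the sketch below fits an initial outcome regression on all trial participants and then targets it toward the average treatment effect in one subgroup, using the known randomization probability. It is a simplified illustration under stated assumptions, not the authors' TMLE-PR or A-TMLE: the simulated data, the logistic working model, the point-treatment binary outcome (rather than the time-to-event setting of the case study), and every function and variable name here are hypothetical.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, offset=None, n_iter=50):
    """Logistic regression by Newton-Raphson, with an optional fixed offset."""
    offset = np.zeros(len(y)) if offset is None else offset
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta + offset)
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + 1e-8 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def design(a, w, s):
    # Hypothetical working model: intercept, treatment, covariate,
    # subgroup indicator, and treatment-by-subgroup interaction.
    return np.column_stack([np.ones_like(w), a, w, s, a * s])

def subgroup_tmle(y, a, w, s, g=0.5):
    """Targeted estimate of the risk difference within the subgroup s == 1.

    g is the known probability of treatment under randomization.
    """
    n = len(y)
    clip = lambda p: np.clip(p, 1e-6, 1.0 - 1e-6)
    # Initial outcome regression pooled over ALL trial participants.
    beta = fit_logistic(design(a, w, s), y)
    q_1 = clip(expit(design(np.ones(n), w, s) @ beta))
    q_0 = clip(expit(design(np.zeros(n), w, s) @ beta))
    q_a = np.where(a == 1, q_1, q_0)
    # Clever covariate: nonzero only inside the target subgroup.
    h_1 = s / g
    h_0 = -s / (1.0 - g)
    h_a = np.where(a == 1, h_1, h_0)
    # Targeting step: one-dimensional fluctuation with the initial fit as offset.
    eps = fit_logistic(h_a[:, None], y, offset=logit(q_a))[0]
    q_1s = expit(logit(q_1) + eps * h_1)
    q_0s = expit(logit(q_0) + eps * h_0)
    q_as = np.where(a == 1, q_1s, q_0s)
    in_sub = s == 1
    psi = np.mean(q_1s[in_sub] - q_0s[in_sub])
    # Influence-curve-based standard error for the subgroup parameter.
    p_s = in_sub.mean()
    ic = (s / p_s) * (h_a * (y - q_as) + (q_1s - q_0s) - psi)
    se = ic.std(ddof=1) / np.sqrt(n)
    return psi, se

# Simulated trial (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 5000
s = rng.binomial(1, 0.08, n).astype(float)   # small subgroup, about 8%
w = rng.normal(size=n)                        # baseline covariate
a = rng.binomial(1, 0.5, n).astype(float)    # randomized treatment
y = rng.binomial(1, expit(-2.0 + 0.5 * w - 0.4 * a + 0.2 * s)).astype(float)

psi, se = subgroup_tmle(y, a, w, s)
print(f"subgroup risk difference: {psi:.4f}")
print(f"95% CI: ({psi - 1.96 * se:.4f}, {psi + 1.96 * se:.4f})")
```

In this sketch the initial regression uses every participant, while the fluctuation and the final average use only subgroup members, so any precision gain comes from the pooled fit and the targeting step relies on the known randomization probability.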
Problem

Research questions and friction points this paper is trying to address.

subgroup analysis
randomized controlled trials
statistical power
treatment effect estimation
small sample size
Innovation

Methods, ideas, or system contributions that make the work stand out.

Targeted Maximum Likelihood Estimation
Subgroup Analysis
Randomized Controlled Trials
Information Borrowing
Treatment Effect Estimation
Sky Qiu
Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA; Center for Targeted Machine Learning and Causal Inference, School of Public Health, University of California, Berkeley, CA, USA
Nerissa Nance
Center for Targeted Machine Learning and Causal Inference, School of Public Health, University of California, Berkeley, CA, USA; Novo Nordisk A/S, Bagsvaerd, Denmark
Rachael Phillips
Center for Targeted Machine Learning and Causal Inference, School of Public Health, University of California, Berkeley, CA, USA
Jens Tarp
Novo Nordisk A/S, Bagsvaerd, Denmark
Maya Petersen
Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA; Center for Targeted Machine Learning and Causal Inference, School of Public Health, University of California, Berkeley, CA, USA
Mark van der Laan
Jiann-Ping Hsu/Karl E. Peace Professor of Biostatistics & Statistics, University of California Berkeley
Statistics, Biostatistics, Causal Inference, Machine Learning, Computational Biology