🤖 AI Summary
This study addresses inaccurate conditional average treatment effect (CATE) estimation in fine-grained subgroups of randomized controlled trials, where some strata have sparse samples. The authors propose a James-Stein-type estimator that uses shrinkage to incorporate coarse-grained CATE estimates from external trials (for example, effects stratified by sex or by race) and thereby improve the precision of fine-grained subgroup CATE estimates in the internal study, while allowing marginal treatment effects to differ between the internal and external populations. Under mild conditions, the estimator uniformly dominates ordinary least squares fitted to internal data alone under a weighted quadratic loss. Simulations show favorable performance compared with existing shrinkage approaches, including empirical Bayes and generalized ridge estimators. Applied to the SURMOUNT-1 tirzepatide trial, the method detects a significantly larger treatment effect on percentage weight loss among White women than among Asian women, a difference not detected using internal data alone.
📝 Abstract
Randomized controlled trials (RCTs) are often underpowered to detect treatment heterogeneity in subgroups defined by cross-classifications of multiple covariates, due to sparse sample sizes in some strata. External RCT data can help, but typically provide treatment effect estimates at a coarser level (e.g., by sex or race) rather than for the finer subgroups of interest (e.g., race-by-sex). We propose a novel James-Stein (JS)-type estimator that borrows strength from such coarsened external estimates to improve estimation of finer subgroup-specific conditional average treatment effects (CATEs) in an internal study, while accommodating potential incompatibility in marginal CATEs across populations. Based on asymptotic theory, we derive a practical analytic variance estimator for the JS estimator that exhibits acceptable empirical performance. Under mild conditions, we show that the proposed estimator uniformly dominates the ordinary least squares (OLS) estimator based on internal data with respect to a weighted quadratic loss. Simulation studies demonstrate favorable performance compared with existing shrinkage methods, including empirical Bayes and generalized ridge estimators. We illustrate our method by estimating race-by-sex subgroup CATEs in a tirzepatide weight-loss trial (SURMOUNT-1), borrowing sex-specific and race-specific estimates from two previous semaglutide trials (STEP 1 and STEP 2). The proposed method detects a significantly larger treatment effect on percentage weight loss in the female-White subgroup than in the female-Asian subgroup, a difference not detected using internal data alone.
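To make the idea of shrinking noisy subgroup estimates toward an external target concrete, here is a minimal sketch of the classical positive-part James-Stein estimator. It is not the estimator proposed in the paper: it assumes independent estimates with a common, known variance and takes the shrinkage target as given, whereas the paper works with OLS estimates, a weighted quadratic loss, and coarsened external information. The function name, arguments, and the numbers in the usage example are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def james_stein_toward_target(
    estimates: Sequence[float],
    target: Sequence[float],
    variance: float,
) -> List[float]:
    """Classical positive-part James-Stein shrinkage of `estimates` toward `target`.

    Simplifying assumptions (not the paper's setting): the estimates are
    independent, normally distributed, and share one known variance.
    """
    p = len(estimates)
    if p != len(target):
        raise ValueError("estimates and target must have the same length")
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 components")

    # Deviation of each internal estimate from its shrinkage target.
    diffs = [e - t for e, t in zip(estimates, target)]
    dist_sq = sum(d * d for d in diffs)
    if dist_sq == 0.0:
        return list(target)

    # Shrinkage factor in [0, 1]: 1 keeps the internal estimates unchanged,
    # 0 replaces them with the target (the positive-part truncation).
    shrink = max(0.0, 1.0 - (p - 2) * variance / dist_sq)
    return [t + shrink * d for t, d in zip(target, diffs)]


if __name__ == "__main__":
    # Hypothetical numbers: six race-by-sex subgroup effect estimates and a
    # target vector standing in for values implied by coarser external data.
    internal = [-18.0, -21.5, -16.0, -20.0, -14.5, -19.0]
    external_target = [-17.0, -19.0, -17.0, -19.0, -17.0, -19.0]
    print(james_stein_toward_target(internal, external_target, variance=4.0))
```

The factor `shrink` grows toward 1 as the internal estimates move farther from the target relative to their noise, so the sketch borrows less from the external information when the two sources disagree.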