🤖 AI Summary
Problem: Extrapolating conditional average treatment effects (CATEs) to a target population is biased when the observational data suffer from unmeasured confounding. Method: We propose a causal extrapolation framework that integrates multiple randomized controlled trials (RCTs) with observational data. The approach uses RCTs of multiple related treatments to estimate a nonlinear deconfounding function, which corrects confounding bias in the observational CATE estimates and extends CATE estimation beyond the support of the RCTs. Contribution/Results: By borrowing information from auxiliary RCTs, the method addresses two limitations of single-RCT approaches, small sample size and narrow covariate coverage, and improves extrapolation when the deconfounding function is nonlinear. Simulation results show that adding a second RCT yields more accurate CATE estimates outside the support of both RCTs, with the largest gains for small RCTs and nonlinear deconfounding functions. The framework offers a way to combine trial and real-world evidence in causal inference.
📝 Abstract
While randomised controlled trials (RCTs) are the gold standard for estimating causal treatment effects, their limited sample sizes and restrictive eligibility criteria make it difficult to extrapolate their results to a broader population. Observational data, while larger, suffer from unmeasured confounding. We can therefore combine the strengths of both data sources to obtain more accurate estimates.
This work extends existing methods that use RCTs to debias conditional average treatment effects (CATEs) estimated in observational data by defining a deconfounding function. Our proposed approach borrows information from RCTs of multiple related treatments to improve the extrapolation of CATEs.
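To make the idea of a deconfounding function concrete, the following is a minimal sketch of the single-RCT debiasing idea that this work builds on, not the multi-RCT method proposed here. It assumes a linear deconfounding function and a simulated data-generating process; every name in it (`true_cate`, `outcome`, `omega`, `theta`, the sample sizes) is hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_cate(x):
    # Hypothetical ground-truth CATE, chosen only for this illustration.
    return 1.0 + 2.0 * x

def outcome(x, t, u, rng):
    # The hidden variable u affects the outcome more strongly for larger x.
    noise = rng.normal(0.0, 0.5, size=x.shape)
    return t * true_cate(x) + x + (1.0 + x) * u + noise

def design(x):
    # Design matrix with an intercept and a linear term.
    return np.column_stack([np.ones_like(x), x])

# Observational data: treatment depends on the unmeasured u (confounding).
n_obs = 20000
x_obs = rng.uniform(-1.0, 1.0, n_obs)
u_obs = rng.normal(size=n_obs)
t_obs = (u_obs + rng.normal(size=n_obs) > 0).astype(float)
y_obs = outcome(x_obs, t_obs, u_obs, rng)

# Confounded CATE estimate omega(x): difference of per-arm linear fits.
beta1, *_ = np.linalg.lstsq(design(x_obs[t_obs == 1]), y_obs[t_obs == 1], rcond=None)
beta0, *_ = np.linalg.lstsq(design(x_obs[t_obs == 0]), y_obs[t_obs == 0], rcond=None)

def omega(x):
    return design(x) @ (beta1 - beta0)

# RCT: randomised treatment (propensity 0.5), covariates only on [-1, 0].
n_rct = 5000
x_rct = rng.uniform(-1.0, 0.0, n_rct)
u_rct = rng.normal(size=n_rct)
t_rct = rng.binomial(1, 0.5, n_rct).astype(float)
y_rct = outcome(x_rct, t_rct, u_rct, rng)

# Transformed outcome: its conditional mean given x equals the true CATE.
y_star = y_rct * (t_rct / 0.5 - (1.0 - t_rct) / 0.5)

# Linear deconfounding function eta(x) = theta[0] + theta[1] * x,
# fitted so that omega(x) + eta(x) matches the RCT signal.
theta, *_ = np.linalg.lstsq(design(x_rct), y_star - omega(x_rct), rcond=None)

def corrected_cate(x):
    return omega(x) + design(x) @ theta

# Evaluate outside the RCT support, where only observational data exist.
x_out = np.linspace(0.05, 1.0, 50)
rmse_naive = np.sqrt(np.mean((omega(x_out) - true_cate(x_out)) ** 2))
rmse_corrected = np.sqrt(np.mean((corrected_cate(x_out) - true_cate(x_out)) ** 2))
print(f"RMSE outside RCT support: naive={rmse_naive:.2f}, corrected={rmse_corrected:.2f}")
```

In this sketch the correction extrapolates well only because the simulated confounding bias is exactly linear in `x`, matching the assumed form of the deconfounding function.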
Simulation results showed that, for non-linear deconfounding functions, using only one RCT gives poor estimates of the CATE outside the support of that RCT. The problem is more pronounced for smaller RCTs. Borrowing information from a second RCT gave more accurate estimates of the CATE outside the support of both RCTs.
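A toy, noise-free illustration of why extrapolation can fail when the deconfounding function is non-linear: a straight line fitted inside a narrow support can track the function closely there and still be far off outside it. This is not the simulation design of this work; the quadratic `eta_true` and the support intervals are assumptions made purely for the example.

```python
import numpy as np

def eta_true(x):
    # Hypothetical non-linear deconfounding function (an assumption).
    return x ** 2

# Covariate values inside and outside the support of a single RCT.
x_in = np.linspace(-1.0, 0.0, 101)
x_out = np.linspace(0.0, 1.0, 101)

# Least-squares straight line fitted only where the RCT has data.
slope, intercept = np.polyfit(x_in, eta_true(x_in), deg=1)

def eta_linear(x):
    return slope * x + intercept

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

rmse_in = rmse(eta_linear(x_in), eta_true(x_in))
rmse_out = rmse(eta_linear(x_out), eta_true(x_out))
print(f"RMSE inside support: {rmse_in:.3f}, outside support: {rmse_out:.3f}")
```

The fit is close on the interval it was trained on and much worse beyond it, which is the kind of gap that additional data covering other regions of the covariate space would be expected to narrow.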