🤖 AI Summary
This study addresses the challenge of constructing confidence intervals (CIs) for the average treatment effect (ATE) from multi-center observational healthcare data (e.g., patient records from different hospitals). We propose a method based on prediction-powered inference that estimates the ATE from multiple observational datasets and yields valid CIs while making few assumptions about those datasets. Theoretically, we prove that the estimator is unbiased and that the CIs are valid, and we extend the method to combinations of experimental and observational datasets. The central idea is to use prediction-powered inference to shrink the CIs, which gives more precise uncertainty quantification than naïve approaches. Numerical experiments confirm the theoretical results. This work provides a statistically rigorous tool for assessing the effectiveness and safety of drugs from multi-center patient records.
📝 Abstract
Constructing confidence intervals (CIs) for the average treatment effect (ATE) from patient records is crucial to assess the effectiveness and safety of drugs. However, patient records typically come from different hospitals, which raises the question of how multiple observational datasets can be effectively combined for this purpose. In this paper, we propose a new method that estimates the ATE from multiple observational datasets and provides valid CIs. Our method makes few assumptions about the observational datasets and is thus widely applicable in medical practice. The key idea of our method is to leverage prediction-powered inference and thereby essentially "shrink" the CIs, so that we offer more precise uncertainty quantification than naïve approaches. We further prove the unbiasedness of our method and the validity of our CIs. We confirm our theoretical results through various numerical experiments. Finally, we provide an extension of our method for constructing CIs from combinations of experimental and observational datasets.
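To give a rough sense of how prediction-powered inference can shrink a CI, the following minimal sketch applies the generic prediction-powered estimator of a population mean. It is not the paper's ATE estimator: the function names, the synthetic data, and the deliberately biased predictor are all assumptions made for illustration only. A small sample with observed outcomes corrects the bias of the predictions, and a large sample with predictions only reduces the variance.

```python
import numpy as np
from statistics import NormalDist


def classical_mean_ci(y, alpha=0.05):
    """Normal-approximation CI for a mean, using only the small sample with outcomes."""
    y = np.asarray(y, dtype=float)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = np.sqrt(y.var(ddof=1) / y.size)
    return y.mean() - z * se, y.mean() + z * se


def ppi_mean_ci(y_small, pred_small, pred_large, alpha=0.05):
    """Generic prediction-powered CI for a mean (illustrative, hypothetical helper).

    y_small    : outcomes observed on the small sample
    pred_small : model predictions on the same small sample
    pred_large : model predictions on a large sample without outcomes
    """
    y_small = np.asarray(y_small, dtype=float)
    pred_small = np.asarray(pred_small, dtype=float)
    pred_large = np.asarray(pred_large, dtype=float)

    # The "rectifier" measures the prediction error on the small sample.
    rectifier = y_small - pred_small
    # Mean prediction on the large sample, corrected by the mean prediction error.
    estimate = pred_large.mean() + rectifier.mean()
    # The two samples are independent, so their variances add.
    se = np.sqrt(
        pred_large.var(ddof=1) / pred_large.size
        + rectifier.var(ddof=1) / rectifier.size
    )
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, (estimate - z * se, estimate + z * se)


def demo(seed=0):
    """Synthetic comparison; the data-generating process is an assumption."""
    rng = np.random.default_rng(seed)
    x_small = rng.normal(size=100)
    x_large = rng.normal(size=10_000)
    y_small = 2.0 * x_small + rng.normal(scale=0.5, size=100)

    def predict(x):
        # A biased but informative predictor (hypothetical).
        return 2.0 * x + 0.3

    classical = classical_mean_ci(y_small)
    estimate, ppi = ppi_mean_ci(y_small, predict(x_small), predict(x_large))
    return classical, estimate, ppi


if __name__ == "__main__":
    classical, estimate, ppi = demo()
    print("classical CI width:", classical[1] - classical[0])
    print("prediction-powered CI width:", ppi[1] - ppi[0])
```

In this synthetic setting the prediction-powered interval is narrower because the prediction errors vary much less than the outcomes themselves. If the predictor were uninformative, the interval would not shrink, and it could even be wider than the classical one.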