🤖 AI Summary
This paper addresses bias in conditional average treatment effect (CATE) estimation under hidden confounding. We propose a calibration method that does not require access to covariates from randomized controlled trials (RCTs), instead leveraging a small RCT dataset to calibrate potential outcomes in large-scale observational data. Our core innovation is a pseudo-confounder generator that jointly performs adversarial distribution alignment and deep CATE estimation, enabling implicit alignment between the potential outcome spaces of observational and RCT data even when RCT covariates are unavailable, thereby relaxing the conventional conditional ignorability assumption. By integrating causal inference with representation learning, our approach significantly reduces CATE estimation error on both synthetic and real-world healthcare datasets, and it is especially suited to privacy-sensitive settings where RCT covariates are unavailable or restricted. Empirical results demonstrate superior robustness and accuracy compared to state-of-the-art methods.
📝 Abstract
One of the major challenges in estimating conditional potential outcomes and conditional average treatment effects (CATE) is the presence of hidden confounders. Since the presence of hidden confounders cannot be tested using observational data alone, conditional unconfoundedness is commonly assumed in the CATE estimation literature. Nevertheless, when this assumption is violated, CATE estimation can be significantly biased by the effects of unobserved confounders. In this work, we consider the case where, in addition to a potentially large observational dataset, a small dataset from a randomized controlled trial (RCT) is available. Notably, we make no assumptions about the availability of covariate information for the RCT dataset; we only require the outcomes to be observed. We propose a CATE estimation method based on a pseudo-confounder generator and a CATE model that aligns the potential outcomes learned from the observational data with those observed in the RCT. Our method is applicable to many practical scenarios of interest, particularly those in which privacy is a concern (e.g., medical applications). Extensive numerical experiments demonstrate the effectiveness of our approach on both synthetic and real-world datasets.
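The core idea, calibrating observational potential-outcome estimates against outcome-only RCT data, can be illustrated with a deliberately simplified sketch. Instead of the paper's pseudo-confounder generator and adversarial distribution alignment, the toy version below matches only the first moment of each arm's predicted outcomes to the corresponding RCT arm means; the data-generating process, variable names, and linear models are all our own illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy data-generating process (our assumption, not the paper's) ---
n = 5000
u = rng.normal(size=n)                               # hidden confounder
x = rng.normal(size=n)                               # observed covariate
t = (x + u + rng.normal(size=n) > 0).astype(float)   # confounded treatment
tau = 2.0                                            # true (constant) CATE
y = x + tau * t + 1.5 * u + rng.normal(scale=0.5, size=n)

# Small RCT: randomized treatment, only (t, y) observed -- no covariates.
m = 500
u_r = rng.normal(size=m)
x_r = rng.normal(size=m)
t_r = rng.integers(0, 2, size=m).astype(float)
y_r = x_r + tau * t_r + 1.5 * u_r + rng.normal(scale=0.5, size=m)

def fit_line(xs, ys):
    """Least-squares fit of ys ~ a + b * xs."""
    A = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coef

# Step 1: per-arm outcome models on observational data (biased, since
# the hidden confounder u drives both treatment assignment and outcome).
a1, b1 = fit_line(x[t == 1], y[t == 1])
a0, b0 = fit_line(x[t == 0], y[t == 0])
mu1 = a1 + b1 * x        # predicted potential outcome under treatment
mu0 = a0 + b0 * x        # predicted potential outcome under control
naive_cate = (mu1 - mu0).mean()

# Step 2: shift each arm's predictions so their mean matches the RCT arm
# mean. This first-moment alignment stands in for the paper's adversarial
# alignment, which matches full outcome distributions rather than means.
shift1 = y_r[t_r == 1].mean() - mu1.mean()
shift0 = y_r[t_r == 0].mean() - mu0.mean()
calibrated_cate = ((mu1 + shift1) - (mu0 + shift0)).mean()

print(f"true CATE       = {tau:.2f}")
print(f"naive CATE      = {naive_cate:.2f}")   # inflated by hidden u
print(f"calibrated CATE = {calibrated_cate:.2f}")
```

Because the RCT randomizes treatment, its per-arm outcome means are unconfounded anchors; the calibration pulls the biased observational estimates back toward them while heterogeneity across covariates still comes from the observational fit.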