Conditional Average Treatment Effect Estimation Under Hidden Confounders

📅 2025-06-14
🤖 AI Summary
This paper addresses bias in conditional average treatment effect (CATE) estimation caused by hidden confounding. It proposes a calibration method that does not require covariates from randomized controlled trials (RCTs): a small RCT dataset is used to calibrate the potential outcomes learned from large-scale observational data. The core innovation is a pseudo-confounder generator that jointly performs adversarial distribution alignment and deep CATE estimation, so that the potential outcome spaces of the observational and RCT data are aligned implicitly, even without RCT covariates, which relaxes the conventional conditional ignorability assumption. By combining causal inference with representation learning, the approach is reported to reduce CATE estimation error on both synthetic and real-world healthcare datasets. It is especially suited to privacy-sensitive settings where RCT covariates are unavailable or restricted. The reported experiments show better robustness and accuracy than state-of-the-art methods.

📝 Abstract
One of the major challenges in estimating conditional potential outcomes and conditional average treatment effects (CATE) is the presence of hidden confounders. Since hidden confounders cannot be tested for using observational data alone, conditional unconfoundedness is commonly assumed in the literature on CATE estimation. When hidden confounders are in fact present, however, estimates that rely on this assumption can be significantly biased. In this work, we consider the case where, in addition to a potentially large observational dataset, a small dataset from a randomized controlled trial (RCT) is available. Notably, we make no assumptions about the availability of covariate information for the RCT dataset; we only require the outcomes to be observed. We propose a CATE estimation method based on a pseudo-confounder generator and a CATE model that aligns the potential outcomes learned from the observational data with those observed in the RCT. Our method is applicable to many practical scenarios of interest, particularly those where privacy is a concern (e.g., medical applications). Extensive numerical experiments demonstrate the effectiveness of our approach on both synthetic and real-world datasets.
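
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the constant `TRUE_EFFECT`, and the helpers `simulate` and `diff_in_means` are all assumptions made for illustration. It compares a naive difference in means on confounded observational data with the same statistic on a smaller randomized sample in which only the arm and the outcome are recorded.

```python
import random
import statistics

TRUE_EFFECT = 1.0  # assumed treatment effect in this toy model


def simulate(n, randomized, seed):
    """Draw (treatment, outcome) pairs; the confounder u is never returned."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # hidden confounder
        if randomized:
            # RCT: treatment is a fair coin flip, independent of u.
            t = 1 if rng.random() < 0.5 else 0
        else:
            # Observational: units with larger u are more likely to be treated.
            t = 1 if u + rng.gauss(0.0, 1.0) > 0 else 0
        # u also raises the outcome, which is what makes it a confounder.
        y = TRUE_EFFECT * t + 2.0 * u + rng.gauss(0.0, 1.0)
        data.append((t, y))
    return data


def diff_in_means(data):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [y for t, y in data if t == 1]
    control = [y for t, y in data if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


# Large confounded sample versus a smaller randomized, outcome-only sample.
obs_estimate = diff_in_means(simulate(20000, randomized=False, seed=0))
rct_estimate = diff_in_means(simulate(2000, randomized=True, seed=1))

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"observational estimate: {obs_estimate:.2f}")
print(f"RCT estimate:           {rct_estimate:.2f}")
```

In this toy model the observational estimate overshoots the true effect by about 4/√π ≈ 2.26, because treated units have systematically larger `u`. The randomized estimate is noisier but centred on the true effect, even though no covariates are used.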
Problem

Research questions and friction points this paper is trying to address.

Estimating CATE with hidden confounders using observational and RCT data
Aligning potential outcomes between observational and RCT datasets
Reducing bias in CATE estimation without requiring covariate information from the RCT
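
For reference, the quantities behind these points can be written in standard potential-outcomes notation. The symbols below are assumed for illustration and are not necessarily the paper's own: X denotes observed covariates, U hidden confounders, T a binary treatment, and Y(t) the potential outcome under treatment t.

```latex
% Standard potential-outcomes notation, assumed here for illustration.
\begin{align*}
\tau(x) &= \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
  && \text{(CATE)} \\
\bigl(Y(0), Y(1)\bigr) &\perp\!\!\!\perp T \mid X
  && \text{(conditional unconfoundedness; fails under hidden confounding)} \\
\bigl(Y(0), Y(1)\bigr) &\perp\!\!\!\perp T \mid (X, U)
  && \text{(holds once the hidden $U$ is included)} \\
\mathbb{E}_{\mathrm{RCT}}\left[\, Y \mid T = t \,\right]
  &= \mathbb{E}_{\mathrm{RCT}}\left[\, Y(t) \,\right]
  && \text{(randomization; no covariates required)}
\end{align*}
```

The last line is why outcome-only RCT data is still informative: randomization identifies the marginal distribution of each potential outcome without any covariates.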
Innovation

Methods, ideas, or system contributions that make the work stand out.

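Combines a small, outcome-only RCT dataset with a large observational dataset
Employs a pseudo-confounder generator together with a CATE model
Aligns the potential outcomes learned from observational data with those observed in the RCT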
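
The alignment idea can be sketched in a much simpler form than the one listed above. The example below does not implement the pseudo-confounder generator. It swaps in a plain first-moment match: each arm's outcome model is shifted by a constant so that its average prediction equals the corresponding RCT arm mean. The toy data-generating process and every name (`fit_line`, `outcome`, `mu`, `shift`, `naive_cate`, `calibrated_cate`) are assumptions, as is the premise that the RCT and observational populations share the same covariate distribution and that RCT arm labels are known.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def outcome(rng, x, t, u):
    # Toy outcome model (assumed): true CATE is 1 + x, hidden u adds 2u.
    return (1.0 + x) * t + 2.0 * u + rng.gauss(0.0, 1.0)


rng = random.Random(0)

# Observational data: x is recorded, u is hidden and drives treatment.
obs = []
for _ in range(20000):
    x, u = rng.uniform(-1.0, 1.0), rng.gauss(0.0, 1.0)
    t = 1 if u + rng.gauss(0.0, 1.0) > 0 else 0
    obs.append((x, t, outcome(rng, x, t, u)))

# RCT data: only the arm and the outcome are kept; covariates are discarded.
rct = []
for _ in range(2000):
    x, u = rng.uniform(-1.0, 1.0), rng.gauss(0.0, 1.0)
    t = 1 if rng.random() < 0.5 else 0
    rct.append((t, outcome(rng, x, t, u)))

# Step 1: naive per-arm outcome models fitted on the observational data.
models = {}
for arm in (0, 1):
    xs = [x for x, t, _ in obs if t == arm]
    ys = [y for _, t, y in obs if t == arm]
    models[arm] = fit_line(xs, ys)


def mu(arm, x):
    """Naive predicted outcome for a unit with covariate x under `arm`."""
    a, b = models[arm]
    return a + b * x


# Step 2: shift each arm so its average prediction over the observational
# covariates matches the mean outcome of the same arm in the RCT.
all_x = [x for x, _, _ in obs]
shift = {}
for arm in (0, 1):
    rct_mean = statistics.fmean([y for t, y in rct if t == arm])
    shift[arm] = rct_mean - statistics.fmean([mu(arm, x) for x in all_x])


def naive_cate(x):
    return mu(1, x) - mu(0, x)


def calibrated_cate(x):
    return naive_cate(x) + shift[1] - shift[0]


for x in (-0.5, 0.0, 0.5):
    print(f"x={x:+.1f}  true={1.0 + x:.2f}  "
          f"naive={naive_cate(x):.2f}  calibrated={calibrated_cate(x):.2f}")
```

A constant shift is enough here only because the confounding bias in this toy model does not vary with `x`. When the bias depends on the covariates, matching first moments corrects the average effect but not its heterogeneity, which is the kind of gap a richer alignment of learned and observed potential outcomes is meant to close.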

Authors

Ahmed Aloui
Duke University
Machine Learning
Juncheng Dong
Department of Electrical and Computer Engineering, Duke University
Ali Hasan
Duke University
Vahid Tarokh
Duke University
Foundations of AI