🤖 AI Summary
Estimating the average treatment effect (ATE) from a subpopulation affected by selection bias can yield severely biased estimates of the ATE in the whole population. This work establishes necessary and sufficient conditions for ATE identifiability under weak assumptions on the probability classes that characterize the propensity score and the selection probability, combining causal graphical models with probabilistic constraints. These conditions are strictly weaker than those required by existing approaches. They extend existing graphical identifiability criteria and give a more general account of causal effect identification under selection bias, broadening the settings in which the ATE can be identified.
📝 Abstract
Selection bias is pervasive in observational studies. For example, large-scale biobank data can exhibit "healthy volunteer bias" when respondents are healthier and of higher socio-economic status than the population they are meant to represent. Recovering causal effects from such subpopulations is an important problem in causal inference, because an average treatment effect (ATE) estimated from a selected population can be a severely biased estimate of the ATE in the whole population.
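To make the healthy-volunteer example concrete, the following toy simulation sketches how a difference in means computed on a selected sample can drift away from the population ATE. Everything in it is an illustrative assumption rather than the model studied in this paper: the latent health variable `h`, the randomized treatment `t`, the outcome equation, and the logistic selection rule are all hypothetical. By construction the population ATE is E[1 + h] = 1, while healthier units both benefit more from treatment and are more likely to be selected.

```python
import math
import random


def simulate(n=100_000, seed=0):
    """Draw (treatment, outcome, selected) triples from a toy model (assumed, not from the paper)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        h = rng.gauss(0.0, 1.0)                      # latent health, standard normal
        t = 1 if rng.random() < 0.5 else 0           # randomized treatment
        y = h + t * (1.0 + h) + rng.gauss(0.0, 1.0)  # unit-level effect is 1 + h
        p_select = 1.0 / (1.0 + math.exp(-2.0 * h))  # healthier units volunteer more often
        s = rng.random() < p_select
        rows.append((t, y, s))
    return rows


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [y for t, y, _ in rows if t == 1]
    control = [y for t, y, _ in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


rows = simulate()
ate_population = diff_in_means(rows)                       # uses every unit
ate_selected = diff_in_means([r for r in rows if r[2]])    # uses selected units only

print(f"difference in means, whole population: {ate_population:.3f}")
print(f"difference in means, selected sample:  {ate_selected:.3f}")
```

Because treatment is randomized in this sketch, the gap between the two numbers comes from selection alone: the selected sample over-represents units with large `h`, whose treatment effect exceeds the population average.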
In this paper, we investigate the identifiability of the ATE under selection bias. We provide necessary and sufficient conditions for ATE identifiability, leveraging weak assumptions on probability classes to characterize the propensity score and the selection probability. Compared with previous work, our results extend existing graphical identifiability criteria and offer a more comprehensive understanding of causal effect identification, under strictly weaker conditions, in the presence of selection bias.