Data Fusion for Partial Identification of Causal Effects

📅 2025-05-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
In causal inference, integrating observational and experimental data often entails simultaneous violations of the no-unmeasured-confounding and cross-source counterfactual-exchangeability assumptions, rendering causal effects nonidentifiable under conventional methods. To address this, the paper proposes a partial identification framework tailored to this dual assumption failure: (i) interpretable sensitivity parameters that quantify the degree of each violation; (ii) tight bounds on causal effects under those parameters; and (iii) *breakdown frontier analysis*, a hierarchical assessment of both the direction of the effect (positive or negative) and the robustness of that conclusion. The method unifies doubly robust estimation, semiparametric bound inference, and sensitivity modeling into an end-to-end computational framework. Empirical validation on the Project STAR dataset demonstrates that the positive effect of class-size reduction on student achievement is robust across the overall population and multiple subgroups, even under substantial joint influence of unmeasured confounding and cross-source heterogeneity.

📝 Abstract
Data fusion techniques integrate information from heterogeneous data sources to improve learning, generalization, and decision making across data sciences. In causal inference, these methods leverage rich observational data to improve causal effect estimation, while maintaining the trustworthiness of randomized controlled trials. Existing approaches often relax the strong no-unobserved-confounding assumption by instead assuming exchangeability of counterfactual outcomes across data sources. However, when both assumptions simultaneously fail (a common scenario in practice), current methods cannot identify or estimate causal effects. We address this limitation by proposing a novel partial identification framework that enables researchers to answer key questions such as: Is the causal effect positive or negative? And how severe must assumption violations be to overturn this conclusion? Our approach introduces interpretable sensitivity parameters that quantify assumption violations and derives corresponding causal effect bounds. We develop doubly robust estimators for these bounds and operationalize breakdown frontier analysis to understand how causal conclusions change as assumption violations increase. We apply our framework to the Project STAR study, which investigates the effect of classroom size on students' third-grade standardized test performance. Our analysis reveals that the Project STAR results are robust to simultaneous violations of key assumptions, both on average and across various subgroups of interest. This strengthens confidence in the study's conclusions despite potential unmeasured biases in the data.
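To make the bounding idea concrete, here is a minimal sketch under a hypothetical additive-bias parameterization (not the paper's actual bounds or estimators): one sensitivity parameter bounds the confounding bias in the observational estimate, another bounds the cross-source exchangeability gap in the trial estimate, and the target effect must lie in the intersection of the two widened intervals. The function name and parameterization are illustrative.

```python
# Illustrative sketch only: a hypothetical additive-bias model for fusing
# a trial estimate and an observational estimate into effect bounds.

def fused_bounds(tau_rct, tau_obs, gamma_conf, gamma_exch):
    """Intersect sensitivity-widened intervals from the two data sources.

    gamma_conf bounds the unmeasured-confounding bias of the
    observational estimate; gamma_exch bounds the cross-source
    exchangeability (transportability) gap of the trial estimate.
    """
    lo_obs = tau_obs - gamma_conf   # observational interval, debiased down
    hi_obs = tau_obs + gamma_conf   # ... and up
    lo_rct = tau_rct - gamma_exch   # trial interval, shifted for transport
    hi_rct = tau_rct + gamma_exch
    # The target effect must be compatible with both sources.
    return max(lo_obs, lo_rct), min(hi_obs, hi_rct)

print(fused_bounds(0.25, 0.5, 0.125, 0.25))  # -> (0.375, 0.5)
```

If the returned lower bound exceeds the upper bound, the two intervals are disjoint, signaling that the chosen sensitivity parameters are too small to reconcile the sources.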
Problem

Research questions and friction points this paper is trying to address.

Addresses causal effect identification when key assumptions fail
Proposes interpretable sensitivity parameters for assumption violations
Evaluates robustness of causal conclusions in real-world studies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel partial identification framework for causal effects
Interpretable sensitivity parameters for assumption violations
Doubly robust estimators for causal effect bounds
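The breakdown-frontier idea above can be sketched as a grid scan: for each level of confounding violation, find the smallest exchangeability violation that overturns a positive conclusion. This is a toy additive-bias version for illustration only; the paper's frontier is defined through its semiparametric bounds, and the names below (`lower_bound`, `breakdown_frontier`) are hypothetical.

```python
# Toy breakdown frontier scan under an additive-bias model (illustrative,
# not the paper's estimator).

def lower_bound(effect_hat, delta_conf, delta_exch):
    """Worst-case lower bound on the effect under additive violations."""
    return effect_hat - delta_conf - delta_exch

def breakdown_frontier(effect_hat, deltas_conf, deltas_exch):
    """For each confounding level, return the smallest exchangeability
    violation on the grid that drives the lower bound to (or below) zero,
    or None if no grid value overturns the positive conclusion."""
    frontier = {}
    for dc in deltas_conf:
        frontier[dc] = next(
            (de for de in sorted(deltas_exch)
             if lower_bound(effect_hat, dc, de) <= 0),
            None,
        )
    return frontier

grid = [0.0, 0.0625, 0.125, 0.1875, 0.25]
print(breakdown_frontier(0.25, grid, grid))
```

Points on the frontier trade off the two violations one-for-one here; with the paper's bounds the trade-off shape itself is informative about which assumption the conclusion is most sensitive to.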