🤖 AI Summary
This work proposes a data-driven information-theoretic framework for sharp partial identification of conditional causal effects in the presence of unobserved confounding. The approach relies solely on a propensity-score-based upper bound on the f-divergence between the observational and interventional outcome distributions, without requiring instrumental variables, sensitivity parameters, or structural modeling assumptions. By combining f-divergence constraints, propensity score modeling, Neyman-orthogonal semiparametric estimators, and flexible machine learning methods, the framework overcomes key limitations of the existing literature (restrictions to discrete or bounded outcomes, dependence on parametric structural models, and the inability to characterize covariate-specific conditional effects) within a unified setting. Simulations and real-data analyses demonstrate that the method yields tight and valid bounds on causal effects across diverse data-generating mechanisms.
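In schematic form, the central inequality described above and in the abstract reads roughly as follows. The concrete form of the function g_f, which the paper derives from the choice of f-divergence and the propensity score alone, is left abstract here:

```latex
D_f\Bigl( P(Y \mid A = a, X = x) \;\Big\|\; P\bigl(Y \mid \mathrm{do}(A = a), X = x\bigr) \Bigr)
\;\le\; g_f\bigl( e_a(x) \bigr),
\qquad e_a(x) := P(A = a \mid X = x).
```

Because the right-hand side involves only the observable propensity score e_a(x), the divergence budget, and hence the resulting bounds on conditional causal effects, can be estimated directly from observational data.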
📝 Abstract
We develop a data-driven information-theoretic framework for sharp partial identification of causal effects under unmeasured confounding. Existing approaches often rely on restrictive assumptions, such as bounded or discrete outcomes; require external inputs (for example, instrumental variables, proxies, or user-specified sensitivity parameters); necessitate fully specified structural causal models; or focus solely on population-level averages while neglecting covariate-conditional treatment effects. We overcome all four limitations simultaneously by establishing novel information-theoretic, data-driven divergence bounds. Our key theoretical contribution shows that the f-divergence between the observational distribution P(Y | A = a, X = x) and the interventional distribution P(Y | do(A = a), X = x) is upper bounded by a function of the propensity score alone. This result enables sharp partial identification of conditional causal effects directly from observational data, without external sensitivity parameters, auxiliary variables, full structural specifications, or outcome boundedness assumptions. For practical implementation, we develop a semiparametric estimator satisfying Neyman orthogonality (Chernozhukov et al., 2018), which ensures root-n-consistent inference even when nuisance functions are estimated with flexible machine learning methods. Simulation studies and real-world data applications, with an implementation available in the GitHub repository (https://github.com/yonghanjung/Information-Theretic-Bounds), demonstrate that our framework provides tight and valid causal bounds across a wide range of data-generating processes.
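To make the partial-identification step concrete, here is a minimal numerical sketch of how a divergence budget translates into bounds on a conditional mean. It uses the KL divergence (the f-divergence with f(t) = t log t), a discretized outcome, and a fixed budget `eps`; in the paper the budget would instead come from the propensity-score bound, and `mean_bounds` is a hypothetical helper, not the authors' implementation.

```python
# Illustrative sketch: bound E_Q[Y] over all Q within a KL ball around the
# observational distribution P(Y | A = a, X = x). The budget `eps` stands in
# for the paper's propensity-score-based bound.
import numpy as np
from scipy.optimize import minimize

def kl(p, q):
    """KL(P || Q), the f-divergence with f(t) = t log t."""
    q = np.clip(q, 1e-12, None)
    return float(np.sum(p * np.log(p / q)))

def mean_bounds(y, p, eps):
    """Lower/upper bounds on E_Q[Y] subject to KL(P || Q) <= eps."""
    cons = [
        {"type": "eq", "fun": lambda q: np.sum(q) - 1.0},   # Q on the simplex
        {"type": "ineq", "fun": lambda q: eps - kl(p, q)},  # divergence budget
    ]
    bnds = [(1e-12, 1.0)] * len(y)
    res = []
    for sign in (+1.0, -1.0):  # minimize, then maximize, E_Q[Y]
        sol = minimize(lambda q: sign * np.dot(q, y), x0=p,
                       bounds=bnds, constraints=cons, method="SLSQP")
        res.append(sign * sol.fun)
    return res[0], res[1]

if __name__ == "__main__":
    y = np.linspace(-2, 2, 21)               # discretized outcome grid
    p = np.exp(-0.5 * y**2); p /= p.sum()    # observational P(Y | a, x)
    for eps in (0.01, 0.05, 0.2):
        lo, hi = mean_bounds(y, p, eps)
        print(f"eps={eps:.2f}: E[Y | do(a), x] in [{lo:.3f}, {hi:.3f}]")
```

Since D_f(P || Q) is convex in Q and the objective is linear, each bound is a convex program, so the solver attains the global optimum; as `eps` grows the interval widens, reflecting greater possible confounding.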
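The Neyman-orthogonality claim follows the double/debiased machine learning template of Chernozhukov et al. (2018). Below is a generic cross-fitted AIPW sketch for an average treatment effect that illustrates the pattern (learn nuisances on one fold, evaluate the orthogonal score on the held-out fold); it is not the paper's estimator for the divergence-based bounds, and the gradient-boosting learners are arbitrary placeholders.

```python
# Generic cross-fitted AIPW (double ML) sketch, not the paper's estimator.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def aipw_cross_fit(X, A, Y, n_splits=5, clip=1e-3):
    """Cross-fitted AIPW estimate of E[Y | do(1)] - E[Y | do(0)] and its SE."""
    psi = np.zeros(len(Y))
    for train, test in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        # Nuisance 1: propensity score e(x) = P(A = 1 | X = x), clipped away from 0/1.
        clf = GradientBoostingClassifier().fit(X[train], A[train])
        e = np.clip(clf.predict_proba(X[test])[:, 1], clip, 1 - clip)
        # Nuisance 2: outcome regressions mu_a(x) = E[Y | A = a, X = x].
        mu = {}
        for a in (0, 1):
            idx = train[A[train] == a]
            mu[a] = GradientBoostingRegressor().fit(X[idx], Y[idx]).predict(X[test])
        # Neyman-orthogonal (AIPW) score, evaluated only on the held-out fold.
        psi[test] = (mu[1] - mu[0]
                     + A[test] * (Y[test] - mu[1]) / e
                     - (1 - A[test]) * (Y[test] - mu[0]) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(psi))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4000
    X = rng.normal(size=(n, 3))
    A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # confounded treatment
    Y = 2.0 * A + X[:, 1] + rng.normal(size=n)        # true effect = 2
    est, se = aipw_cross_fit(X, A, Y)
    print(f"ATE estimate: {est:.3f} +/- {1.96 * se:.3f}")
```

The AIPW score has zero first-order sensitivity to errors in the nuisance functions at the truth, so slow machine-learning nuisance rates still permit root-n inference, which is the property the abstract invokes.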