Causal Inference with Categorical Unobserved Confounder via Mixture Learning

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

career value

196K/year

🤖 AI Summary

This work addresses the challenge of identifying causal effects in the presence of categorical unobserved confounders by proposing a novel mixture-learning approach. The method models the unobserved confounding structure as a discrete mixture distribution and, leveraging proxy variables or multiple treatment settings, establishes the first rigorous identifiability result for causal effects under such conditions. By employing tensor decomposition techniques, the approach consistently recovers the latent confounding structure from observational data and yields an estimator with non-asymptotic error guarantees. Both theoretical analysis and empirical evaluations demonstrate that the proposed method accurately estimates causal effects even with finite samples, achieving superior performance on synthetic and real-world datasets.

📝 Abstract

Unobserved confounding is a fundamental challenge for estimating causal effects. To address unobserved confounding, recent literature has turned to two different approaches -- proxy variables and the use of multiple treatments. The first approach, commonly referred to as proximal causal inference, requires proxies to be assigned to specific asymmetric roles: treatment-inducing proxies (negative control exposures), variables that act as common causes of the treatment and outcome, and outcome-inducing proxies (negative control outcomes). In practice, however, identifying variables that satisfy these asymmetric roles can be difficult depending on the application domain. The second approach, commonly referred to as the ``Deconfounder," deals with multiple conditionally independent treatments. There has been limited progress towards developing a consistent estimation method for this setting. As the primary contribution of this work, we establish that causal effects are identifiable in both settings when the unobserved confounder is categorical under suitable conditions. Our approach builds on a mixture learning perspective: we show that the underlying confounding structure can be recovered by identifying the corresponding mixture distribution. We propose an estimation procedure based on tensor decomposition, which allows consistent recovery of the latent structure and comes with non-asymptotic guarantees. Simulation studies and real data experiments demonstrate that the proposed method performs well even with limited data.

Problem

Research questions and friction points this paper is trying to address.

unobserved confounding

causal inference

categorical confounder

proxy variables

multiple treatments

Innovation

Methods, ideas, or system contributions that make the work stand out.

mixture learning

tensor decomposition

unobserved confounding