Causal Discovery on Dependent Binary Data

📅 2024-12-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Conventional causal discovery methods assume independent and identically distributed (i.i.d.) samples, so they fail on binary data whose observational units are dependent, a situation that arises in many real-world settings. Method: We propose the first causal graph learning framework tailored to dependent binary data. It models correlated errors via a latent utility framework, estimates the covariance among units with a pairwise maximum-likelihood estimator, and uses an EM-style iterative algorithm that draws on the estimated covariance to decorrelate the samples. The resulting decorrelated samples can be fed directly into standard causal discovery algorithms (e.g., PC, GES). Contribution/Results: The framework ensures statistical identifiability while maintaining computational tractability. Extensive experiments on synthetic and real-world datasets demonstrate substantial improvements in the accuracy of causal graph structure recovery, indicating that decorrelation preprocessing is highly beneficial for causal discovery from dependent binary data.

📝 Abstract

The assumption of independence between observations (units) in a dataset is prevalent across various methodologies for learning causal graphical models. However, this assumption often conflicts with real-world data, posing challenges to accurate structure learning. We propose a decorrelation-based approach for causal graph learning on dependent binary data, where the local conditional distribution is defined by a latent utility model with dependent errors across units. We develop a pairwise maximum likelihood method to estimate the covariance matrix for the dependence among the units. Then, leveraging the estimated covariance matrix, we develop an EM-like iterative algorithm to generate and decorrelate samples of the latent utility variables, which serve as decorrelated data. Any standard causal discovery method can be applied to the decorrelated data to learn the underlying causal graph. We demonstrate that the proposed decorrelation approach significantly improves the accuracy of causal graph learning, through numerical experiments on both synthetic and real-world datasets.
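
To make the setting concrete, here is a minimal Python sketch of a latent utility model with errors that are dependent across units, followed by a whitening step. It is an illustration under stated assumptions, not the paper's algorithm: the two-variable graph X1 → X2, the AR(1)-style covariance among units, the coefficient `beta`, and the function names are all hypothetical. The sketch also whitens the true latent errors using the true covariance, whereas the abstract describes estimating the covariance by pairwise maximum likelihood and sampling the unobserved latent utilities in an EM-like loop.

```python
import numpy as np


def simulate_dependent_binary(n=1000, rho=0.6, beta=1.0, seed=0):
    """Simulate binary X1 -> X2 from a latent utility model (illustrative).

    Errors are standard normal per unit but correlated ACROSS units with an
    assumed AR(1)-style covariance sigma[i, j] = rho ** |i - j|.
    """
    rng = np.random.default_rng(seed)
    idx = np.arange(n)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(sigma)  # sigma = chol @ chol.T

    # One column of errors per variable; rows (units) are dependent.
    eps = chol @ rng.standard_normal((n, 2))

    x1 = (eps[:, 0] > 0).astype(int)          # root node: threshold its error
    latent_x2 = beta * x1 + eps[:, 1]         # latent utility of X2 given X1
    x2 = (latent_x2 > 0).astype(int)          # observed binary outcome
    return sigma, chol, eps, np.column_stack([x1, x2])


def decorrelate(chol, latent):
    """Whiten across units: solve chol @ w = latent for w."""
    return np.linalg.solve(chol, latent)


def lag1_corr(v):
    """Sample correlation between neighbouring units."""
    return np.corrcoef(v[:-1], v[1:])[0, 1]


if __name__ == "__main__":
    sigma, chol, eps, data = simulate_dependent_binary()
    white = decorrelate(chol, eps)
    print("lag-1 correlation before whitening:", lag1_corr(eps[:, 0]))
    print("lag-1 correlation after whitening: ", lag1_corr(white[:, 0]))
```

In this toy setup, neighbouring units are clearly correlated before whitening and roughly uncorrelated afterwards. The binary matrix `data` is what an analyst would actually observe; a standard causal discovery method would be run on decorrelated latent samples rather than on `data` directly.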

Problem

Research questions and friction points this paper is trying to address.

Causal Inference

Dependent Observations

Structural Causal Models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Binary Causal Inference

Iterative Correction Algorithm

Dependent Observations Modeling
