Consistent DAG selection for Bayesian causal discovery under general error distributions

📅 2025-08-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses causal structure identification, that is, recovering the underlying directed acyclic graph (DAG) of a linear structural equation model (SEM), from observational data under general assumptions: the error terms are independent, allowed to be non-Gaussian, and have no prespecified parametric form. Method: a Bayesian hierarchical framework with a non-standard DAG prior and scale-mixture-of-Gaussian error modeling to flexibly capture non-Gaussianity. Contribution/Results: the paper characterizes the distribution equivalence class of the true DAG, the sharpest structure identifiable from purely observational data under these errors, and establishes posterior DAG selection consistency: the posterior probability of the distribution equivalence class of the true DAG converges to one as the sample size grows. Examples and simulation studies illustrate the method's empirical effectiveness.

📝 Abstract
We consider the problem of learning the underlying causal structure among a set of variables, which are assumed to follow a Bayesian network or, more specifically, a linear recursive structural equation model (SEM) with the associated errors being independent and allowed to be non-Gaussian. A Bayesian hierarchical model is proposed to identify the true data-generating directed acyclic graph (DAG) structure where the nodes and edges represent the variables and the direct causal effects, respectively. Moreover, incorporating the information of non-Gaussian errors, we characterize the distribution equivalence class of the true DAG, which specifies the best possible extent to which the DAG can be identified based on purely observational data. Furthermore, under the consideration that the errors are distributed as some scale mixture of Gaussian, where the mixing distribution is unspecified, and mild distributional assumptions, we establish that by employing a non-standard DAG prior, the posterior probability of the distribution equivalence class of the true DAG converges to unity as the sample size grows. This shows that the proposed method achieves the posterior DAG selection consistency, which is further illustrated with examples and simulation studies.
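As an illustrative sketch (not the paper's implementation), the data-generating model in the abstract can be simulated directly: a linear recursive SEM over a known DAG, with errors drawn as a scale mixture of Gaussians. A Laplace error, for instance, is an exponential scale mixture of Gaussians and so falls within the paper's error class. All variable names, edge weights, and the three-node DAG below are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical example: simulate n samples from a linear recursive SEM
# X1 -> X2 -> X3 with scale-mixture-of-Gaussian (here, Laplace) errors.
rng = np.random.default_rng(0)
n = 1000

# Row j of B holds the coefficients of X_j's parents (illustrative weights).
B = np.array([[0.0,  0.0, 0.0],
              [0.8,  0.0, 0.0],
              [0.0, -0.5, 0.0]])

# Laplace errors via a Gaussian scale mixture: the variance of each error
# is drawn from an exponential mixing distribution.
mixing_scales = rng.exponential(scale=1.0, size=(n, 3))
eps = rng.normal(size=(n, 3)) * np.sqrt(mixing_scales)

# The SEM reads X = X B^T + eps, so X = eps (I - B^T)^{-1}.
X = eps @ np.linalg.inv(np.eye(3) - B.T)

# Sanity check: OLS of X2 on X1 should roughly recover the weight 0.8.
beta_hat = np.linalg.lstsq(X[:, [0]], X[:, 1], rcond=None)[0][0]
```

Because the errors are non-Gaussian, the true DAG is identifiable up to its distribution equivalence class, which is finer than the Markov equivalence class that limits Gaussian-error methods.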
Problem

Research questions and friction points this paper is trying to address.

Learning causal structure in linear recursive SEMs with non-Gaussian errors
Identifying true DAG structure using Bayesian hierarchical model
Achieving posterior DAG selection consistency under scale mixture errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian hierarchical model for DAG identification
Non-Gaussian error distribution characterization
Non-standard DAG prior for consistency
Anamitra Chaudhuri
Department of Statistics, Texas A&M University
Anirban Bhattacharya
Professor, Texas A&M University
Bayesian statistics · High-dimensional data · Nonparametrics
Yang Ni
Department of Statistics, Texas A&M University