🤖 AI Summary
This work addresses limitations of constraint-based causal discovery methods for models with latent variables: such methods depend on the order in which conditional independence tests are run, propagate early testing errors, and are sensitive to the chosen significance level. The authors propose a score-based structure learning framework and show that a properly formulated scoring function achieves score equivalence and consistency for latent variable causal models, with identifiability guarantees. They also characterize the degrees of freedom of the marginal distribution over the observed variables under several structural assumptions from the literature, which gives a unified view of existing constraint-based methods. By extending Greedy Equivalence Search (GES) and integrating continuous optimization, they derive both exact and continuous score-based methods. Experiments validate the effectiveness of the proposed methods in recovering causal structure when latent variables are present.
📝 Abstract
Identifying latent variables and the causal structure involving them is essential across various scientific fields. While many existing works fall under the category of constraint-based methods (e.g., those based on conditional independence or rank deficiency tests), they may face empirical challenges such as testing-order dependency, error propagation, and the choice of an appropriate significance level. These issues can potentially be mitigated by properly designed score-based methods, such as Greedy Equivalence Search (GES) (Chickering, 2002) in the specific setting without latent variables. Yet, formulating score-based methods with latent variables is highly challenging. In this work, we develop score-based methods that are capable of identifying causal structures containing causally related latent variables with identifiability guarantees. Specifically, we show that a properly formulated scoring function can achieve score equivalence and consistency for structure learning of latent variable causal models. We further provide a characterization of the degrees of freedom for the marginal over the observed variables under multiple structural assumptions considered in the literature, and accordingly develop both exact and continuous score-based methods. This offers a unified view of several existing constraint-based methods with different structural assumptions. Experimental results validate the effectiveness of the proposed methods.
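To make "score equivalence" concrete, the following minimal sketch illustrates it in the fully observed, linear Gaussian setting that GES was designed for. It is not the latent-variable score developed in this work. The function `gaussian_bic`, the `parents` dictionary, and the simulated data are all hypothetical names and choices made for illustration. The score is the maximized log-likelihood minus a BIC penalty proportional to the number of free parameters (the degrees of freedom).

```python
import numpy as np

def gaussian_bic(data, parents):
    """BIC of a linear Gaussian DAG (illustrative, fully observed case only).

    data:    (n, d) array of samples.
    parents: dict mapping each node index to a list of its parent indices.
    """
    n = data.shape[0]
    # Center the data; mean parameters are omitted because they add the
    # same count to every graph and so do not affect comparisons.
    centered = data - data.mean(axis=0)
    log_lik = 0.0
    num_params = 0
    for node, pa in parents.items():
        y = centered[:, node]
        if pa:
            X = centered[:, pa]
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ coef
        else:
            resid = y
        sigma2 = resid @ resid / n  # maximum likelihood noise variance
        log_lik += -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
        num_params += len(pa) + 1  # one weight per edge plus one variance
    return log_lik - 0.5 * num_params * np.log(n)

# Simulated data from X -> Y (an assumption made for this example).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
data = np.column_stack([x, y])

score_xy = gaussian_bic(data, {0: [], 1: [0]})     # X -> Y
score_yx = gaussian_bic(data, {0: [1], 1: []})     # Y -> X
score_empty = gaussian_bic(data, {0: [], 1: []})   # no edge

print("X -> Y :", score_xy)
print("Y -> X :", score_yx)
print("empty  :", score_empty)
```

The two single-edge graphs are Markov equivalent and receive the same score up to floating-point error, while the empty graph scores lower because it cannot explain the dependence. The abstract's claim is that a suitably formulated score retains this kind of equivalence, together with consistency, when latent variables are present, where the degrees of freedom must be counted for the marginal over the observed variables.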