🤖 AI Summary
This paper addresses the structural identifiability problem for partially observable linear causal systems with latent variables. It proposes the first score-based greedy search method for this setting with theoretical identifiability guarantees. The approach introduces: (1) the Generalized N Factor Model together with a global consistency result, under which the true structure, latent variables included, is identifiable from the score up to its Markov equivalence class; (2) the Latent variable Greedy Equivalence Search (LGES) algorithm, a greedy search over this model class that targets the optimal structure; and (3) well-defined search operators that let LGES traverse the graph space efficiently. Experiments on synthetic and real-world datasets validate the effectiveness of the method.
📝 Abstract
Identifying the structure of a partially observed causal system is essential to various scientific fields. Recent advances have focused on constraint-based causal discovery to solve this problem, yet in practice these methods often face challenges related to multiple testing and error propagation. A score-based method could mitigate these issues, which has drawn considerable attention to the question of whether a score-based greedy search method can handle the partially observed scenario. In this work, we propose the first score-based greedy search method for identifying structure involving latent variables with identifiability guarantees. Specifically, we propose the Generalized N Factor Model and establish global consistency: the true structure, including latent variables, can be identified up to the Markov equivalence class using the score. We then design Latent variable Greedy Equivalence Search (LGES), a greedy search algorithm for this class of models with well-defined operators, which searches the graph space very efficiently to find the optimal structure. Our experiments on both synthetic and real-life data validate the effectiveness of our method (code will be publicly available).