Ordering-based Causal Discovery via Generalized Score Matching

📅 2026-01-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the long-standing challenge of learning causal directed acyclic graph (DAG) structures over discrete variables from purely observational data, which has been hindered by the lack of effective scoring functions. The authors extend score-matching-based causal ordering identification to discrete settings by introducing a leaf-node criterion based on the discrete score function. Their method first recovers a valid topological order and then prunes spurious edges to reconstruct the underlying causal graph. Empirical evaluations on both synthetic and real-world datasets show that the method accurately infers the true causal order. Furthermore, when the identified ordering is supplied to existing causal discovery algorithms, it improves their accuracy in nearly all of the tested settings, highlighting its practical utility and compatibility with current frameworks.
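The two-stage pipeline described above (peel off leaves to get a topological order, then prune edges) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `order_by_leaf_removal`, `candidate_edges`, and `prune` are hypothetical, and the leaf score and pruning test used in the demo are oracles that read a known toy graph. In the paper, the leaf score would instead come from the discrete score function criterion, whose exact form is not given on this page.

```python
from typing import Callable, List, Sequence, Tuple

Edge = Tuple[str, str]


def order_by_leaf_removal(
    nodes: Sequence[str],
    leaf_score: Callable[[str, Sequence[str]], float],
) -> List[str]:
    """Build a topological order by repeatedly removing the most leaf-like node.

    `leaf_score(v, remaining)` is a hypothetical callback: lower means that
    `v` looks more like a leaf of the subgraph induced by `remaining`.
    """
    remaining = list(nodes)
    leaves_first: List[str] = []
    while remaining:
        leaf = min(remaining, key=lambda v: leaf_score(v, remaining))
        leaves_first.append(leaf)
        remaining.remove(leaf)
    # Leaves were found first, so reverse to put root-like nodes first.
    return leaves_first[::-1]


def candidate_edges(order: Sequence[str]) -> List[Edge]:
    """All edges allowed by the order: earlier nodes may point to later ones."""
    return [
        (order[i], order[j])
        for i in range(len(order))
        for j in range(i + 1, len(order))
    ]


def prune(edges: Sequence[Edge], keep_edge: Callable[[str, str], bool]) -> List[Edge]:
    """Keep only the candidate edges that pass a (hypothetical) pruning test."""
    return [(u, v) for (u, v) in edges if keep_edge(u, v)]


# Toy ground truth used only to drive the oracles below: A -> B -> C.
true_edges = {("A", "B"), ("B", "C")}


def oracle_leaf_score(v: str, remaining: Sequence[str]) -> float:
    # Oracle stand-in (assumption): count children of v among remaining nodes.
    return sum(1 for (u, w) in true_edges if u == v and w in remaining)


def oracle_keep(u: str, v: str) -> bool:
    # Oracle stand-in (assumption): keep an edge only if it is truly present.
    return (u, v) in true_edges


order = order_by_leaf_removal(["C", "A", "B"], oracle_leaf_score)
full_graph = candidate_edges(order)
pruned_graph = prune(full_graph, oracle_keep)
```

With the oracles above, the recovered order places A before B before C, the order admits three candidate edges, and pruning drops the spurious edge from A to C. Separating the ordering step from the pruning step is also what allows an identified order to be handed to another causal discovery algorithm as a constraint on which edge directions it may consider.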

📝 Abstract

Learning DAG structures from purely observational data remains a long-standing challenge across scientific domains. An emerging line of research leverages the score of the data distribution to first identify a topological order of the underlying DAG via leaf node detection and then performs edge pruning for graph recovery. This paper extends the score matching framework for causal discovery, which was originally designed for continuous data, and introduces a novel leaf discriminant criterion based on the discrete score function. Through simulated and real-world experiments, we demonstrate that our theory enables accurate inference of true causal orders from observed discrete data, and that the identified ordering can significantly boost the accuracy of existing causal discovery baselines in nearly all settings.

Problem

Research questions and friction points this paper is trying to address.

causal discovery

discrete data

topological order

DAG learning

observational data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalized Score Matching

Discrete Score Function

Leaf Node Detection