Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms

📅 2025-02-20
🤖 AI Summary
This paper addresses a practical weakness of constraint-based causal discovery (e.g., the PC algorithm): its core assumptions are difficult to verify, and finite-sample effects can induce erroneous causal structures. The authors propose internal coherency scores that require neither a ground-truth causal graph nor additional statistical tests. The method relies on logical inference from conditional independence test results, propagation of graph-structural constraints, and detection of consistency conflicts. The paper gives a complete classification of erroneous results, distinguishing detectable from undetectable errors, and proves that the detectable ones can be measured by the proposed scores. The scores are illustrated on the PC algorithm with simulated and real-world datasets, with the stated aim of making internal coherency checks a standard diagnostic when applying constraint-based methods.

📝 Abstract
Causal discovery aims to infer causal graphs from observational or experimental data. Methods such as the popular PC algorithm are based on conditional independence testing and rely on enabling assumptions, such as the faithfulness assumption, for their inferences. In practice, these assumptions, as well as the functional assumptions inherited from the chosen conditional independence test, are typically taken as given and not further tested for their validity on the data. In this work, we propose internal coherency scores that allow testing for assumption violations and finite-sample errors, whenever these are detectable, without requiring ground truth or further statistical tests. We provide a complete classification of erroneous results, including a distinction between detectable and undetectable errors, and prove that the detectable erroneous results can be measured by our scores. We illustrate our coherency scores on the PC algorithm with simulated and real-world datasets, and envision that testing for internal coherency can become a standard tool in applying constraint-based methods, much like a suite of tests is used to validate the assumptions of classical regression analysis.
Problem

Research questions and friction points this paper is trying to address.

Testing assumption violations in causal discovery
Detecting finite sample errors without ground truth
Classifying erroneous results in constraint-based algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Internal coherency scores introduced
Detect assumption violations and finite-sample errors without ground truth or further statistical tests
Classify erroneous results comprehensively
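
To make the idea of an internal coherency check concrete, the following is a minimal sketch of one way such a check could look. It is not the paper's definition of its scores. It assumes a fully oriented output graph (a DAG) and a record of the conditional independence decisions made during the search, and it reports the fraction of those decisions that disagree with d-separation in the output graph. The names `d_separated` and `disagreement_rate`, the parent-set graph encoding, and the example test record are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def ancestors_of(dag, nodes):
    """Return `nodes` together with all of their ancestors.

    `dag` maps every node to the set of its parents (assumed encoding).
    """
    seen, stack = set(nodes), list(nodes)
    while stack:
        for parent in dag[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def d_separated(dag, x, y, z):
    """Check d-separation of x and y given z via the moralization criterion."""
    keep = ancestors_of(dag, {x, y} | set(z))
    # Moral graph of the ancestral subgraph: connect each node to its
    # parents, and connect every pair of parents sharing a child.
    adj = {v: set() for v in keep}
    for child in keep:
        parents = dag[child]
        for p in parents:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(parents, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-separated iff no path connects them once z is removed.
    blocked, seen, stack = set(z), {x}, [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        for w in adj[v]:
            if w not in blocked and w not in seen:
                seen.add(w)
                stack.append(w)
    return True


def disagreement_rate(dag, ci_results):
    """Fraction of recorded CI decisions that contradict the output graph.

    `ci_results` holds tuples (x, y, conditioning_set, judged_independent).
    This simple ratio is an illustrative stand-in, not the paper's score.
    """
    mismatches = sum(
        d_separated(dag, x, y, z) != independent
        for x, y, z, independent in ci_results
    )
    return mismatches / len(ci_results)


# Hypothetical output graph: the collider X -> Z <- Y.
collider = {"X": set(), "Y": set(), "Z": {"X", "Y"}}

# Hypothetical record of CI decisions (made up for this example).
tests = [
    ("X", "Y", (), True),       # agrees: X and Y are d-separated marginally
    ("X", "Z", (), False),      # agrees: X and Z are adjacent
    ("Y", "Z", (), False),      # agrees: Y and Z are adjacent
    ("X", "Y", ("Z",), True),   # conflicts: conditioning on Z opens the collider
]

rate = disagreement_rate(collider, tests)
print(rate)
```

In this toy record, three decisions agree with the graph and one conflicts with it, so the rate is 0.25. A check of this kind uses only information the algorithm already produced, which is the sense in which no ground truth or further statistical tests are needed.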