🤖 AI Summary
Causal discovery aims to infer causal structure among variables from observational data, a foundation for AI-driven decision-making and intervention. Existing constraint-based methods (e.g., PC) rely on conditional independence tests but suffer from poor statistical power in small-sample regimes; score-based approaches (e.g., NOTEARS) enable differentiable optimization but forgo explicit independence constraints. This paper proposes a differentiable $d$-separation scoring framework that combines soft logic with percolation theory: conditional independencies are encoded as soft logical formulas; a continuous, differentiable $d$-separation score is constructed via percolation-theoretic arguments; and the resulting objective is optimized end to end with gradients to estimate the causal graph. The method unifies the statistical rigor of constraint-based approaches with the optimization flexibility of score-based ones, and empirically outperforms both families of baselines in low-sample regimes on a real-world dataset. Code and data are publicly available.
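To make the idea concrete, here is a minimal, hypothetical sketch of the general recipe the summary describes: treat entries of a soft adjacency matrix as edge "open" probabilities, compute a percolation-style soft reachability between variables, and penalize soft connectivity between pairs that a conditional independence test declared independent. The function names, the noisy-OR accumulation, and the unconditional-independence simplification are all illustrative assumptions, not the paper's actual construction.

```python
import numpy as np


def soft_reachability(A, n_steps=None):
    """Fuzzy transitive closure: R[i, j] approximates the 'probability'
    that some directed path i -> j is open, accumulating one-edge path
    extensions with a noisy-OR (an illustrative percolation-style proxy)."""
    d = A.shape[0]
    n_steps = n_steps or d
    R = A.copy()
    for _ in range(n_steps):
        ext = np.clip(R @ A, 0.0, 1.0)   # paths extended by one edge
        R = 1.0 - (1.0 - R) * (1.0 - ext)  # noisy-OR with existing paths
    return R


def independence_penalty(A, independent_pairs):
    """Penalize soft connectivity between variable pairs declared
    independent by a CI test -- a crude, unconditional stand-in for a
    differentiable d-separation score (no conditioning sets here)."""
    R = soft_reachability(A)
    return sum(R[i, j] + R[j, i] for i, j in independent_pairs)
```

Because every operation is smooth in `A`, such a penalty could in principle be minimized by gradient descent alongside an acyclicity term (as in NOTEARS); the paper's percolation-based score handles full conditional $d$-separation, which this toy version does not.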
📝 Abstract
Causal discovery from observational data is a fundamental task in artificial intelligence, with far-reaching implications for decision-making, prediction, and intervention. Despite significant advances, existing methods broadly fall into two categories: constraint-based and score-based approaches. Constraint-based methods offer rigorous causal discovery but are often hindered by small sample sizes, while score-based methods provide flexible optimization but typically forgo explicit conditional independence testing. This work explores a third avenue: differentiable $d$-separation scores, obtained through percolation theory applied to soft logic. These scores enable a new type of causal discovery method: gradient-based optimization of conditional independence constraints. Empirical evaluations demonstrate the robust performance of our approach in low-sample regimes, surpassing traditional constraint-based and score-based baselines on a real-world dataset. Code and data for the proposed method are publicly available at https://github.com/PurdueMINDS/DAGPA.