🤖 AI Summary
Existing differentiable causal discovery methods rely on soft acyclicity constraints, which force optimization over invalid cyclic graphs, cause numerical instability, and limit scalability. This work proposes PACER (Perturbation-driven Acyclic Causal Edge Recovery), a framework that jointly models variable permutations and edge probabilities, so that optimization runs directly over valid directed acyclic graphs (DAGs) without penalty terms, and observational and interventional data are handled through a single likelihood. PACER is described as the first permutation-based approach to enable scalable causal discovery, and it supports flexible conditional density models and the incorporation of structural prior knowledge. In the linear-Gaussian setting, it provides closed-form expressions for the expected interventional log-likelihood and its gradients. Experiments show that PACER matches or exceeds state-of-the-art methods on protein signaling and large-scale genetic perturbation benchmarks, scales efficiently to networks with thousands of variables, and achieves speedups of up to two orders of magnitude over penalty-based differentiable approaches.
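To make "acyclic by construction" concrete, here is a minimal sketch of how a sampled variable ordering combined with per-edge probabilities can only ever yield a DAG. It is an illustration under stated assumptions, not PACER's implementation: the Gumbel-perturbed scores used to draw the ordering, and the names `sample_order`, `sample_dag`, and `is_acyclic`, are hypothetical choices made for this example.

```python
import math
import random


def sample_order(scores, rng):
    """Draw a permutation by sorting Gumbel-perturbed scores (assumed parameterization)."""
    noisy = []
    for s in scores:
        u = max(rng.random(), 1e-12)  # guard against log(0)
        noisy.append(s - math.log(-math.log(u)))
    # Nodes with larger perturbed scores come earlier in the ordering.
    return sorted(range(len(scores)), key=lambda i: -noisy[i])


def sample_dag(scores, edge_probs, rng):
    """Sample an adjacency matrix whose edges all point forward in a sampled ordering."""
    d = len(scores)
    order = sample_order(scores, rng)
    position = {node: rank for rank, node in enumerate(order)}
    adj = [[0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # Edge i -> j is allowed only if i precedes j, so no cycle can form.
            if position[i] < position[j] and rng.random() < edge_probs[i][j]:
                adj[i][j] = 1
    return order, adj


def is_acyclic(adj):
    """Check acyclicity with Kahn's algorithm (used here only as a sanity check)."""
    d = len(adj)
    indeg = [sum(adj[i][j] for i in range(d)) for j in range(d)]
    ready = [j for j in range(d) if indeg[j] == 0]
    visited = 0
    while ready:
        i = ready.pop()
        visited += 1
        for j in range(d):
            if adj[i][j]:
                indeg[j] -= 1
                if indeg[j] == 0:
                    ready.append(j)
    return visited == d
```

Because every sampled edge respects the sampled ordering, no acyclicity penalty or post-hoc cycle removal is needed in this sketch. The `is_acyclic` check is there only to confirm the property.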
📝 Abstract
Inferring the structure of directed acyclic graphs (DAGs) from data is a central challenge in causal discovery, particularly in modern high-dimensional settings where large-scale interventional data are increasingly available. While interventional data can improve identifiability, existing methods remain limited by soft acyclicity constraints, leading to optimization over invalid cyclic graphs, numerical instability, and reduced scalability. We introduce PACER (Perturbation-driven Acyclic Causal Edge Recovery), a scalable framework for causal discovery that guarantees acyclicity by construction. PACER parameterizes a distribution over DAGs through a joint model of variable permutations and edge probabilities, enabling direct optimization over valid causal structures without surrogate penalties. The framework supports a unified likelihood-based treatment of observational and interventional data, flexible conditional density models, and the incorporation of structural prior knowledge. For linear-Gaussian mechanisms, we derive closed-form expressions for the expected interventional log-likelihood and its gradients, yielding substantial computational gains. Empirically, PACER matches or exceeds state-of-the-art methods on protein signaling and large-scale genetic perturbation benchmarks, while scaling efficiently to networks with thousands of variables and achieving up to two orders of magnitude speedups over penalty-based differentiable approaches. These results demonstrate that exact and scalable causal discovery from high-dimensional perturbation data is achievable through principled search space design.
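As a rough illustration of a unified likelihood over observational and interventional samples, the sketch below evaluates the log-density of one sample under a linear-Gaussian model for a fixed graph. It assumes perfect interventions with known targets, so an intervened variable's own conditional term is dropped. It does not reproduce the paper's closed-form expected log-likelihood over the distribution of DAGs, and the function name and argument layout are hypothetical.

```python
import math


def interventional_log_likelihood(x, weights, noise_var, targets):
    """Log-density of one sample under a linear-Gaussian SEM with perfect interventions.

    x:         list of d observed values
    weights:   d x d nested list; weights[i][j] is the coefficient of edge i -> j (0 if absent)
    noise_var: list of d noise variances
    targets:   set of intervened node indices (empty set means observational data)
    """
    d = len(x)
    total = 0.0
    for j in range(d):
        if j in targets:
            # Assumption: a perfect intervention replaces node j's mechanism,
            # so its conditional term does not depend on the graph and is skipped.
            continue
        mean = sum(weights[i][j] * x[i] for i in range(d))
        resid = x[j] - mean
        total += -0.5 * math.log(2 * math.pi * noise_var[j]) - resid ** 2 / (2 * noise_var[j])
    return total
```

Passing an empty `targets` set recovers the ordinary observational log-likelihood, which is what lets one objective cover both data regimes in this sketch.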