Convex Mixed-Integer Programming for Causal Additive Models with Optimization and Statistical Guarantees

📅 2025-11-26
🤖 AI Summary
This paper addresses the problem of learning directed acyclic graphs (DAGs) from data generated by nonlinear additive noise models (ANMs) with Gaussian noise. The authors propose a convex mixed-integer programming method based on basis function expansion and group ℓ₀ regularization, which enables explicit control of edge sparsity and seamless integration of structural prior knowledge. Theoretically, they establish statistical consistency and optimization convergence guarantees; by connecting the branch-and-bound optimality gap to their statistical error bounds, they also derive an early stopping criterion and can certify verifiably optimal solutions. Experiments demonstrate that the approach significantly outperforms state-of-the-art DAG learning algorithms on both synthetic and real-world high-dimensional datasets. To the authors' knowledge, this is the first method to achieve consistent graph structure recovery under nonlinear ANMs while providing verifiable optimality within a user-specified precision.

📝 Abstract
We study the problem of learning a directed acyclic graph from data generated according to an additive, non-linear structural equation model with Gaussian noise. We express each non-linear function through a basis expansion, and derive a maximum likelihood estimator with a group ℓ₀-regularization that penalizes the number of edges in the graph. The resulting estimator is formulated through a convex mixed-integer program, enabling the use of branch-and-bound methods to obtain a solution that is guaranteed to be accurate up to a pre-specified optimality gap. Our formulation can naturally encode background knowledge, such as the presence or absence of edges and partial order constraints among the variables. We establish consistency guarantees for our estimator in terms of graph recovery, even when the number of variables grows with the sample size. Additionally, by connecting the optimality guarantees with our statistical error bounds, we derive an early stopping criterion that allows terminating the branch-and-bound procedure while preserving consistency. Compared with existing approaches that either assume equal error variances, restrict to linear structural equation models, or rely on heuristic procedures, our method enjoys both optimization and statistical guarantees. Extensive simulations and real-data analysis show that the proposed method achieves markedly better graph recovery performance.
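In rough schematic form (our own notation, omitting the paper's exact convex reformulation), the estimator described in the abstract combines a basis-expanded Gaussian likelihood with binary edge indicators; the big-M constant M₀ below is a hypothetical device for linking coefficients to indicators:

```latex
% Additive model: each node is a sum of basis-expanded functions of its parents
X_j \;=\; \sum_{k \neq j} f_{jk}(X_k) + \varepsilon_j,
\qquad f_{jk}(x) \;=\; \sum_{m=1}^{M} \beta_{jkm}\,\phi_m(x),
\qquad \varepsilon_j \sim \mathcal{N}(0,\sigma_j^2)

% Group-l0 penalized maximum likelihood with binary edge indicators z_{kj}:
\min_{\beta,\; z \in \{0,1\}^{p(p-1)}}
\; -\,\ell_n(\beta) \;+\; \lambda \sum_{j=1}^{p} \sum_{k \neq j} z_{kj}
\quad \text{s.t.} \quad
\lVert \beta_{jk\cdot} \rVert \le M_0\, z_{kj}
\;\; \forall\, k \neq j,
\qquad z \text{ encodes a DAG (acyclicity constraints).}
```

Turning all coefficients of a group off when its indicator is zero is what makes the penalty a group ℓ₀ count of edges rather than a shrinkage penalty.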
Problem

Research questions and friction points this paper is trying to address.

Learning directed acyclic graphs from additive nonlinear structural equation models
Formulating maximum likelihood estimation via convex mixed-integer programming
Establishing optimization and statistical guarantees for graph recovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Convex mixed-integer programming for causal discovery
Group ℓ₀-regularization for sparse graph structure
Branch-and-bound linked to statistical error bounds, yielding early stopping with consistency guarantees
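As a concrete (and deliberately tiny) illustration of the group-ℓ₀ idea, the sketch below enumerates all DAGs on three variables, regresses each node on a basis expansion of its parents, and scores graphs by a Gaussian profile log-likelihood plus a per-edge penalty. Exhaustive enumeration stands in for branch-and-bound (and trivially certifies a zero optimality gap); the data-generating model, basis, and penalty weight are our own hypothetical choices, not the paper's.

```python
import itertools
import numpy as np

# Toy data from a nonlinear additive-noise model: 0 -> 1 and 0 -> 2.
rng = np.random.default_rng(0)
n, p = 500, 3
x1 = rng.normal(size=n)                        # root node
x2 = np.sin(x1) + 0.3 * rng.normal(size=n)     # nonlinear child of x1
x3 = x1 ** 2 + 0.3 * rng.normal(size=n)        # nonlinear child of x1
X = np.column_stack([x1, x2, x3])

def basis(v):
    """A small fixed basis standing in for the paper's general expansion."""
    return np.column_stack([v, v ** 2, np.sin(v)])

def node_rss(j, parents):
    """Residual sum of squares of node j regressed on its parents' bases."""
    y = X[:, j]
    if not parents:
        return float(np.sum((y - y.mean()) ** 2))
    B = np.column_stack([np.ones(n)] + [basis(X[:, k]) for k in parents])
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return float(np.sum((y - B @ coef) ** 2))

def is_dag(edges):
    """Kahn's algorithm: True iff the directed edge set is acyclic."""
    children = {i: [] for i in range(p)}
    indeg = [0] * p
    for a, b in edges:
        children[a].append(b)
        indeg[b] += 1
    stack = [i for i in range(p) if indeg[i] == 0]
    seen = 0
    while stack:
        u = stack.pop()
        seen += 1
        for v in children[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return seen == p

# BIC-style per-edge penalty (3 basis coefficients per edge group).
lam = 0.5 * 3 * np.log(n)
candidates = [(a, b) for a in range(p) for b in range(p) if a != b]
best_score, best_edges = np.inf, ()
for r in range(len(candidates) + 1):
    for subset in itertools.combinations(candidates, r):
        if not is_dag(subset):
            continue
        # Gaussian profile log-likelihood: (n/2) * log(RSS_j / n) per node,
        # plus the group-l0 penalty counting edges.
        score = sum(
            0.5 * n * np.log(node_rss(j, [a for a, b in subset if b == j]) / n)
            for j in range(p)
        ) + lam * len(subset)
        if score < best_score:
            best_score, best_edges = score, subset

print(sorted(best_edges))
```

In a real instance the DAG space is far too large to enumerate, which is exactly why the paper's convex MIP formulation and branch-and-bound with a certified optimality gap matter.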
Xiaozhu Zhang
Department of Statistics, University of Washington
Nir Keret
Department of Biostatistics, University of Washington
Ali Shojaie
Professor, University of Washington
statistics, biostatistics, machine learning, network analysis
Armeen Taeb
University of Washington
statistics, optimization