🤖 AI Summary
This paper addresses the problem of learning directed acyclic graphs (DAGs) from data generated by nonlinear additive noise models (ANMs) with Gaussian noise. We express each nonlinear function through a basis expansion and derive a maximum likelihood estimator with a group ℓ₀ penalty on the number of edges, formulated as a convex mixed-integer program. This formulation enables explicit control of edge sparsity, seamless integration of structural prior knowledge, and, via branch-and-bound, solutions that are verifiably optimal up to a user-specified gap. Theoretically, we establish consistency of graph recovery even when the number of variables grows with the sample size, and by connecting the optimality gap to our statistical error bounds, we derive an early stopping criterion that terminates branch-and-bound while preserving consistency. Experiments demonstrate that our approach significantly outperforms state-of-the-art DAG learning algorithms on both synthetic and real-world high-dimensional datasets. To the best of our knowledge, this is the first method to achieve consistent graph structure recovery under nonlinear ANMs while providing verifiable optimality within a user-specified precision.
📝 Abstract
We study the problem of learning a directed acyclic graph from data generated according to an additive, non-linear structural equation model with Gaussian noise. We express each non-linear function through a basis expansion and derive a maximum likelihood estimator with a group ℓ₀ regularization that penalizes the number of edges in the graph. The resulting estimator is formulated through a convex mixed-integer program, enabling the use of branch-and-bound methods to obtain a solution that is guaranteed to be accurate up to a pre-specified optimality gap. Our formulation can naturally encode background knowledge, such as the presence or absence of edges and partial order constraints among the variables. We establish consistency guarantees for our estimator in terms of graph recovery, even when the number of variables grows with the sample size. Additionally, by connecting the optimality guarantees with our statistical error bounds, we derive an early stopping criterion that allows terminating the branch-and-bound procedure while preserving consistency. Compared with existing approaches that either assume equal error variances, restrict to linear structural equation models, or rely on heuristic procedures, our method enjoys both optimization and statistical guarantees. Extensive simulations and real-data analysis show that the proposed method achieves markedly better graph recovery performance.
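To make the estimator concrete, here is a minimal, hedged sketch of the core idea on a toy 3-node nonlinear ANM: each candidate parent is expanded through a polynomial basis, every acyclic edge set is scored by a Gaussian log-likelihood plus a group-ℓ₀ penalty (a per-edge cost λ), and the minimizer is found by exhaustive enumeration standing in for the paper's branch-and-bound over the mixed-integer program. All specifics below (cubic basis, λ = 3 log n, the simulated link functions) are illustrative choices, not the paper's actual formulation.

```python
# Sketch: group-l0-penalized likelihood DAG search on a toy nonlinear ANM.
# Enumeration replaces branch-and-bound; basis/penalty choices are illustrative.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 3

# Ground-truth ANM: x0 -> x1 -> x2 with polynomial link functions.
x0 = rng.normal(size=n)
x1 = x0**2 + 0.5 * x0 + 0.3 * rng.normal(size=n)
x2 = 0.5 * x1**2 - x1 + 0.3 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

def basis(v):
    """Cubic polynomial basis expansion of one parent variable."""
    return np.column_stack([v, v**2, v**3])

def node_rss(j, parents):
    """Residual sum of squares of node j regressed on its parents' bases."""
    y = X[:, j]
    if not parents:
        return float(np.sum((y - y.mean()) ** 2))
    B = np.column_stack([basis(X[:, p]) for p in parents] + [np.ones((n, 1))])
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return float(np.sum((y - B @ coef) ** 2))

def is_dag(edges):
    """Kahn's algorithm: True iff the edge set is acyclic."""
    indeg = {v: 0 for v in range(d)}
    adj = {v: [] for v in range(d)}
    for i, j in edges:
        adj[i].append(j)
        indeg[j] += 1
    queue = [v for v in range(d) if indeg[v] == 0]
    seen = 0
    while queue:
        v = queue.pop()
        seen += 1
        for w in adj[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return seen == d

lam = 3 * np.log(n)  # group-l0 penalty per edge (illustrative choice)
candidates = [(i, j) for i in range(d) for j in range(d) if i != j]
best_score, best_edges = np.inf, None
for r in range(len(candidates) + 1):
    for edges in itertools.combinations(candidates, r):
        if not is_dag(edges):
            continue
        # Gaussian negative log-likelihood (up to constants) + edge penalty.
        score = lam * len(edges)
        for j in range(d):
            pa = [i for (i, jj) in edges if jj == j]
            score += (n / 2) * np.log(node_rss(j, pa) / n)
        if score < best_score:
            best_score, best_edges = score, set(edges)

print(best_edges)  # estimated edge set
```

Note how the penalty acts on whole groups of basis coefficients: an edge is either present, with all its basis terms, or absent entirely, which is exactly what the group ℓ₀ term encodes in the mixed-integer formulation, where binary edge indicators switch coefficient groups on and off.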