🤖 AI Summary
Integer programming (IP) approaches to Bayesian network structure learning (BNSL) suffer from computational bottlenecks in the pricing problem due to exponential growth in the number of variables and constraints. Method: We propose a dynamic optimization framework based on row/column generation, the first to formulate the BNSL pricing problem as a difference-of-submodular optimization task and solve it efficiently via an inexact Difference of Convex Algorithm (DCA), circumventing the complexity limitations of exact pricing. Our method integrates ℓ₀-penalized likelihood scoring with column generation. Results: On continuous Gaussian data, it significantly improves solution quality, particularly for dense graphs, outperforming mainstream score-based methods; on large graphs, it matches state-of-the-art constraint-based and hybrid approaches. This work establishes a new scalable paradigm for BNSL.
📝 Abstract
In this paper, we consider a score-based Integer Programming (IP) approach for solving the Bayesian Network Structure Learning (BNSL) problem. State-of-the-art BNSL IP formulations suffer from an exponentially large number of variables and constraints. A standard IP technique for addressing such challenges is row and column generation, which generates rows and columns dynamically, but the resulting pricing problem remains a computational bottleneck for BNSL. For the general class of $\ell_0$-penalized likelihood scores, we show how the pricing problem can be reformulated as a difference of submodular optimization problem, and how the Difference of Convex Algorithm (DCA) can be applied as an inexact method to solve the pricing problems efficiently. Empirically, we show that, for continuous Gaussian data, our row and column generation approach yields higher-quality solutions than state-of-the-art score-based approaches, especially as graph density increases, and achieves performance comparable to benchmark constraint-based and hybrid approaches, even as graph size increases.
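The core algorithmic ingredient named above, DCA, minimizes a function written as a difference of two convex parts, f = g − h, by repeatedly linearizing h at the current iterate and minimizing the resulting convex surrogate. The paper applies this to a difference-of-submodular pricing problem; the sketch below is only a generic, toy illustration of the DCA iteration on a scalar DC decomposition (the function names, the toy objective x⁴ − x², and the closed-form subproblem solver are all our assumptions, not the paper's formulation):

```python
def dca(grad_h, argmin_linearized, x0, tol=1e-10, max_iter=100):
    """Generic DC Algorithm: minimize f = g - h (g, h convex) by
    iterating x_{k+1} = argmin_x g(x) - grad_h(x_k) * x."""
    x = x0
    for _ in range(max_iter):
        x_new = argmin_linearized(grad_h(x))
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x_new

# Toy DC instance: f(x) = x^4 - x^2 with g(x) = x^4, h(x) = x^2.
# The linearized subproblem argmin_x x^4 - s*x has the closed form
# x = sign(s) * (|s|/4)^(1/3), since setting 4x^3 = s and solving.
minimizer = dca(
    grad_h=lambda x: 2.0 * x,                       # h'(x) = 2x
    argmin_linearized=lambda s: (1 if s >= 0 else -1) * (abs(s) / 4.0) ** (1.0 / 3.0),
    x0=1.0,
)
# Converges to a stationary point of x^4 - x^2, i.e. x = 1/sqrt(2).
```

Each DCA step only requires solving a convex (here, closed-form) subproblem, which is what makes it attractive as an inexact pricing oracle: iterates decrease the objective monotonically, at the cost of guaranteeing only a stationary point rather than a global minimum.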