🤖 AI Summary
This work addresses the challenge of solving large-scale mixed-integer linear programs (MILPs), which remains difficult due to high computational complexity. Existing graph neural network–based approaches are limited by their local receptive fields and struggle to capture global problem structure. To overcome this, we propose a general-purpose neural backbone that, for the first time, integrates a dual attention mechanism into MILP representation learning. Specifically, self-attention and cross-attention operate in parallel on the variable-constraint bipartite graph, enabling global information exchange and deep feature representation. The resulting architecture supports diverse downstream tasks—including instance-level, element-level, and solver-state-level predictions—and consistently outperforms state-of-the-art methods across multiple standard benchmarks, significantly improving both solving efficiency and generalization capability.
📝 Abstract
Mixed-integer linear programming (MILP), a widely used modeling framework for combinatorial optimization, is central to many scientific and engineering applications, yet remains computationally challenging at scale. Recent advances in deep learning address this challenge by representing MILP instances as variable-constraint bipartite graphs and applying graph neural networks (GNNs) to extract latent structural patterns and enhance solver efficiency. However, these architectures are inherently limited by the locality of message passing, which restricts their representational power and hinders neural approaches to MILP. Here we present an attention-driven neural architecture that learns expressive representations beyond the pure graph view. A dual-attention mechanism performs parallel self- and cross-attention over variables and constraints, enabling global information exchange and deeper representation learning. We apply this general backbone to diverse downstream tasks at the instance level, element level, and solver-state level. Extensive experiments across widely used benchmarks show consistent improvements of our approach over state-of-the-art baselines, highlighting attention-based neural architectures as a powerful foundation for learning-enhanced mixed-integer linear optimization.
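To make the dual-attention idea concrete, the following is a minimal illustrative sketch (not the paper's implementation) of parallel self- and cross-attention over variable and constraint embeddings. All names are hypothetical, and learned projections, multi-head attention, residual connections, and normalization are omitted for brevity:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: every query attends to every key (global receptive field)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def dual_attention_block(V, C):
    """One illustrative dual-attention step.

    V: variable embeddings, shape (n_vars, d)
    C: constraint embeddings, shape (n_cons, d)
    Self-attention within each side and cross-attention across sides
    are computed in parallel from the same inputs, then combined.
    """
    V_new = attention(V, V, V) + attention(V, C, C)  # variables attend to variables and constraints
    C_new = attention(C, C, C) + attention(C, V, V)  # constraints attend to constraints and variables
    return V_new, C_new

rng = np.random.default_rng(0)
V = rng.standard_normal((5, 8))  # 5 variables with 8-dim features
C = rng.standard_normal((3, 8))  # 3 constraints with 8-dim features
V_out, C_out = dual_attention_block(V, C)
print(V_out.shape, C_out.shape)  # (5, 8) (3, 8)
```

Unlike GNN message passing, which only exchanges information along the edges of the variable-constraint bipartite graph, every variable here attends to every other variable and every constraint in a single step, which is the source of the global information exchange described above.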