Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization

📅 2025-11-12
🤖 AI Summary
Variable selection heuristics in branch-and-bound (B&B) for mixed-integer linear programming (MILP) suffer from low efficiency and poor generalization. Method: We propose the first model-based reinforcement learning (MBRL) framework for B&B, which learns a model of the B&B environment dynamics and integrates Monte Carlo tree search (MCTS) for forward-looking, adaptive branching decisions, yielding both interpretability and sample efficiency. Contribution/Results: Our approach overcomes the dual limitations of static heuristics and model-free RL in modeling capacity and data efficiency. Evaluated on four standard MILP benchmarks, it consistently outperforms state-of-the-art RL-driven branching policies, achieving significant reductions in solving time, number of explored nodes, and optimality gap. These results validate the effectiveness and scalability of model-guided planning for combinatorial optimization.

📝 Abstract
Mixed-Integer Linear Programming (MILP) lies at the core of many real-world combinatorial optimization (CO) problems, traditionally solved by branch-and-bound (B&B). A key driver of B&B solver efficiency is the variable selection heuristic that guides branching decisions. Looking to move beyond static, hand-crafted heuristics, recent work has explored adapting traditional reinforcement learning (RL) algorithms to the B&B setting, aiming to learn branching strategies tailored to specific MILP distributions. In parallel, RL agents have achieved remarkable success in board games, a specific class of combinatorial problems, by leveraging environment simulators to plan via Monte Carlo Tree Search (MCTS). Building on these developments, we introduce Plan-and-Branch-and-Bound (PlanB&B), a model-based reinforcement learning (MBRL) agent that leverages a learned internal model of the B&B dynamics to discover improved branching strategies. Computational experiments empirically validate our approach, with our MBRL branching agent outperforming previous state-of-the-art RL methods across four standard MILP benchmarks.
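The planning idea the abstract describes, MCTS over a learned model of B&B dynamics to pick a branching variable, can be sketched in miniature. Everything below is illustrative: `ToyBBModel`, its hash-based reward, the two-step rollout, and the UCB1 bandit over candidates are invented stand-ins, not PlanB&B's actual (neural) model or search.

```python
import math
import random

class ToyBBModel:
    """Toy stand-in for a learned B&B dynamics model: maps a
    (state, branching variable) pair to (next state, scalar reward).
    State is just a tuple of past branching decisions here."""
    def step(self, state, var):
        next_state = state + (var,)
        # Deterministic pseudo-reward; in practice the model would predict
        # something like negative expected subtree size or bound improvement.
        reward = -((hash(next_state) % 100) / 100.0)
        return next_state, reward

def mcts_select(model, root_state, candidates, n_sims=200, c=1.4, seed=0):
    """One-step MCTS over candidate branching variables, simulated entirely
    inside the model. Returns the most-visited variable."""
    rng = random.Random(seed)
    visits = {v: 0 for v in candidates}
    value_sum = {v: 0.0 for v in candidates}
    for _ in range(n_sims):
        total = sum(visits.values()) + 1
        def ucb(v):  # UCB1: mean value plus exploration bonus
            if visits[v] == 0:
                return float("inf")
            return value_sum[v] / visits[v] + c * math.sqrt(math.log(total) / visits[v])
        var = max(candidates, key=ucb)
        state, reward = model.step(root_state, var)
        for _ in range(2):  # short random rollout inside the learned model
            state, r = model.step(state, rng.choice(candidates))
            reward += r
        visits[var] += 1
        value_sum[var] += reward
    return max(candidates, key=lambda v: visits[v])
```

The key property this sketch shares with the paper's framing is that lookahead happens in the learned model, not the real solver, which is where the sample-efficiency gain over model-free RL comes from.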
Problem

Research questions and friction points this paper is trying to address.

Learns branching strategies for Mixed-Integer Linear Programming
Uses model-based reinforcement learning for combinatorial optimization
Improves variable selection heuristics in branch-and-bound algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-based RL agent for branching strategies
Learned internal model of B&B dynamics
Outperforms state-of-the-art RL methods
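The "learned internal model of B&B dynamics" is fit on transitions observed while the real solver runs. A minimal sketch of that data-collection step, assuming an invented `(state, action) -> (next_state, reward)` environment interface rather than the paper's actual solver integration:

```python
def collect_transitions(env_step, policy, state, horizon):
    """Roll out a branching policy in the (real) B&B environment and record
    (state, action, next_state, reward) tuples -- the training data a
    dynamics model would be fit on by regression."""
    data = []
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = env_step(state, action)
        data.append((state, action, next_state, reward))
        state = next_state
    return data

# Toy stand-ins: state is the tuple of past branchings, reward is -1 per node.
def toy_env_step(state, action):
    return state + (action,), -1.0

transitions = collect_transitions(toy_env_step, lambda s: len(s) % 2, (), horizon=5)
```

Once such transitions are collected, the model is trained to predict `next_state` and `reward`, after which planning (e.g., MCTS) can run against the model instead of the solver.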