AI Summary
This work addresses the high computational cost and memory bottlenecks in diffusion-based neural combinatorial optimization solvers during inference, which stem from dense edge or factor interactions. To this end, we propose LoRe, a training-free, plug-and-play dynamic interaction selection mechanism that operates at inference time. LoRe introduces, for the first time, the concept of dynamic routing from many-body physics into graph neural solvers, adaptively selecting critical interactions at each step based on conflict and uncertainty metrics while enforcing strict budget constraints. Without requiring any retraining, LoRe substantially enhances scalability: on the maximum independent set problem, it enables over 3× larger instance sizes, 8× faster inference, and a 12× reduction in peak memory; on thousand-node TSP instances, it achieves 15× speedup and 44× less memory usage, all while maintaining competitive solution quality.
Abstract
Diffusion-based neural solvers for combinatorial optimization repeatedly re-evaluate dense edge/factor interactions, making inference expensive in wall-clock time and often memory-bound at scale. Inspired by the computational methodologies of many-body physics, we introduce LoRe, a training-free, inference-time drop-in wrapper that enforces per-step interaction-evaluation budgeting: at each iteration, it evaluates only a fixed fraction of interactions by dynamically routing computation to high-conflict or high-uncertainty interactions, instead of using a fixed sparsification (e.g., static kNN graphs or static masks). Under fully inclusive end-to-end wall-clock accounting, LoRe substantially improves scalability on the Maximum Independent Set (MIS) problem, extending feasible inference more than $3\times$ beyond the baseline's out-of-memory limit, delivering a $\sim 8\times$ speedup and a $\sim 12\times$ peak-memory reduction, with solution quality preserved in this regime. Demonstrating cross-task generality on the large-scale Traveling Salesperson Problem (TSP) and zero-shot robustness to topology shifts, LoRe achieves a $\sim 15\times$ speedup at $n=1000$ with a $44\times$ memory reduction and competitive tour quality.
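To make the per-step budgeting idea concrete, the following is a minimal sketch of budgeted edge selection for an MIS-style problem. It is an illustration under stated assumptions, not the authors' implementation: the function names (`binary_entropy`, `select_edges`), the mixing weight `alpha`, and the specific conflict and uncertainty scores are all hypothetical choices, since the abstract does not specify how LoRe computes them.

```python
import heapq
import math


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli(p) variable; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_edges(edges, probs, budget_fraction, alpha=0.5):
    """Return the highest-scoring edges under a hard per-step budget.

    Hypothetical scoring (an assumption, not taken from the paper):
      conflict    = p_u * p_v  (both endpoints likely in the set)
      uncertainty = mean entropy of the two endpoint probabilities
      score       = alpha * conflict + (1 - alpha) * uncertainty
    """
    # Hard cap: evaluate at most this many interactions in the current step.
    k = max(1, int(budget_fraction * len(edges)))

    def score(edge):
        u, v = edge
        conflict = probs[u] * probs[v]
        uncertainty = 0.5 * (binary_entropy(probs[u]) + binary_entropy(probs[v]))
        return alpha * conflict + (1.0 - alpha) * uncertainty

    # nlargest returns the k best edges, sorted from highest to lowest score.
    return heapq.nlargest(k, edges, key=score)


# Toy 4-cycle with current per-node selection probabilities.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
probs = [0.9, 0.8, 0.1, 0.5]
selected = select_edges(edges, probs, budget_fraction=0.5)
print(selected)
```

In a diffusion solver, a routine like this would be called once per denoising step with the current probabilities, so the selected subset changes as the solution evolves. That is the distinction the abstract draws against static kNN graphs or fixed masks, which choose the subset once.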