Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization

📅 2024-06-24
🏛️ arXiv.org
📈 Citations: 3 (influential: 0)
🤖 AI Summary
For NP-hard routing problems (e.g., TSP, CVRP), existing neural solvers lack real-time adaptability at inference: they cannot dynamically exploit newly available computational budget or instance-specific information. This work proposes MEMENTO, a neural-solver framework that incorporates a dynamic memory mechanism into inference. MEMENTO updates its action distributions online by integrating feedback from previous decisions, enabling zero-shot combination with diversity-based solvers without fine-tuning or collections of pre-trained policies. The method combines reinforcement learning, autoregressive modeling, memory-augmented networks, and online policy adaptation. Evaluated on 12 benchmark tasks, MEMENTO sets a new state of the art (SOTA) on 11, significantly outperforming both tree-search and policy-gradient fine-tuning approaches. Crucially, it demonstrates strong scalability, data efficiency, and solution quality on large-scale TSP and CVRP instances.
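To make the core idea concrete, here is a minimal sketch of inference-time adaptation via a memory of past decision outcomes. This is an illustration only: the `MemoryModule` class, its per-action bonus vector, and the advantage-weighted update rule are assumptions for exposition, not MEMENTO's actual architecture (which conditions on richer memory features).

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

class MemoryModule:
    """Toy memory: record (action, advantage) feedback from past rollouts
    and reweight a frozen base policy's logits at inference time."""
    def __init__(self, num_actions, lr=0.5):
        self.bonus = np.zeros(num_actions)  # per-action adjustment learned online
        self.lr = lr

    def record(self, action, advantage):
        # Reinforce actions whose observed outcome beat the running baseline.
        self.bonus[action] += self.lr * advantage

    def adapt(self, base_logits):
        # Combine the frozen policy's logits with memory-derived bonuses.
        return softmax(base_logits + self.bonus)

# Usage: one adaptation step on a toy 3-action decision.
mem = MemoryModule(num_actions=3)
base_logits = np.array([1.0, 1.0, 1.0])  # uniform base policy
mem.record(action=2, advantage=2.0)      # action 2 led to a better-than-average tour
probs = mem.adapt(base_logits)           # action 2 is now sampled more often
```

The key property this sketch shares with the paper's setup is that the base policy's weights never change; only the lightweight memory state is updated between rollouts, which is what makes the adaptation cheap enough to run within an inference budget.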

📝 Abstract
Combinatorial Optimization is crucial to numerous real-world applications, yet still presents challenges due to its (NP-)hard nature. Amongst existing approaches, heuristics often offer the best trade-off between quality and scalability, making them suitable for industrial use. While Reinforcement Learning (RL) offers a flexible framework for designing heuristics, its adoption over handcrafted heuristics remains incomplete within industrial solvers. Existing learned methods still lack the ability to adapt to specific instances and fully leverage the available computational budget. The current best methods either rely on a collection of pre-trained policies, or on data-inefficient fine-tuning; hence failing to fully utilize newly available information within the constraints of the budget. In response, we present MEMENTO, an approach that leverages memory to improve the adaptation of neural solvers at inference time. MEMENTO enables updating the action distribution dynamically based on the outcome of previous decisions. We validate its effectiveness on benchmark problems, in particular Traveling Salesman and Capacitated Vehicle Routing, demonstrating its superiority over tree-search and policy-gradient fine-tuning; and showing it can be zero-shot combined with diversity-based solvers. We successfully train all RL auto-regressive solvers on large instances, and show that MEMENTO can scale and is data-efficient. Overall, MEMENTO enables to push the state-of-the-art on 11 out of 12 evaluated tasks.
Problem

Research questions and friction points this paper is trying to address.

Enhancing neural solvers' adaptability to specific routing instances
Improving computational budget utilization through memory mechanisms
Overcoming limitations of pre-trained policies and RL fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-enhanced neural solvers improve routing problem solutions
Dynamic action adjustment using online decision outcome data
Zero-shot combination with diversity-based solvers demonstrated
👥 Authors
Félix Chalumeau (InstaDeep)
Refiloe Shabe (InstaDeep)
Noah de Nicola (University of Cape Town)
Arnu Pretorius (InstaDeep, Staff Research Scientist)
Thomas D. Barrett (InstaDeep)
Nathan Grinsztajn (Cohere)