🤖 AI Summary
This survey reviews neural combinatorial optimization (NCO) solvers for vehicle routing problems (VRPs) and identifies four shared limitations: poor generalization, limited scalability to large-scale instances, inability to handle most VRP variants simultaneously, and difficulty of comparison with classical operations research algorithms. It proposes a four-category taxonomy (Learning to Construct, Learning to Improve, Learning to Predict-Once, and Learning to Predict-Multiplicity) and systematically surveys state-of-the-art approaches under it. It also compares representative solvers from the reinforcement, supervised, and unsupervised learning paradigms across VRPs of varying scales, and observes that existing efforts target only one or two of the limitations, with none addressing all of them concurrently. An accompanying web page serves as a live repository of NCO solvers organized by the taxonomy. Contributions include: (i) an up-to-date survey of NCO solvers for VRPs; (ii) a discussion of open challenges and promising directions; and (iii) a live repository intended to support further progress in the community.
📝 Abstract
Although several surveys of Neural Combinatorial Optimization (NCO) solvers designed specifically for Vehicle Routing Problems (VRPs) have been conducted, they do not cover the state-of-the-art (SOTA) NCO solvers that have emerged recently. To establish a comprehensive and up-to-date taxonomy of NCO solvers, we systematically review relevant publications and preprints, categorizing them into four distinct types: Learning to Construct, Learning to Improve, Learning to Predict-Once, and Learning to Predict-Multiplicity solvers. We then present the inadequacies of the SOTA solvers, including poor generalization, inability to solve large-scale VRPs, inability to address most types of VRP variants simultaneously, and difficulty in comparing these NCO solvers with conventional Operations Research algorithms. We also discuss ongoing efforts, identify open inadequacies, and propose promising and viable directions to overcome them. Notably, existing efforts focus on only one or two of these inadequacies, with none attempting to address all of them concurrently. In addition, we compare the performance of representative NCO solvers from the Reinforcement, Supervised, and Unsupervised Learning paradigms across VRPs of varying scales. Finally, following the proposed taxonomy, we provide an accompanying web page as a live repository for NCO solvers. Through this survey and the live repository, we aim to foster further advancements in the NCO community.