Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture

📅 2024-06-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a critical limitation of neural Vehicle Routing Problem (VRP) solvers: poor generalization across problem sizes and distributions. The authors propose two plug-and-play components: (1) an Entropy-based Scaling Factor (ESF), which rescales Transformer attention weights so that their pattern stays close to those learned during training as problem size varies; and (2) a Distribution-Specific (DS) decoder, a set of lightweight auxiliary decoders that explicitly model multiple training distribution patterns, expanding the model's representation space to cover a broader range of distributional scenarios. Extensive experiments on TSP and CVRP, over both synthetic and widely recognized real-world benchmarks, show improved cross-size and cross-distribution generalization compared with seven baseline models. Both components incur negligible computational overhead and can be integrated into existing generalization strategies as plug-in enhancements.
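The ESF idea, rescaling attention logits so that the softmax's sharpness stays comparable as the number of nodes grows, can be sketched as follows. The `sqrt(log n)` correction and the single-head NumPy attention are illustrative assumptions for exposition only, not the paper's exact formulation.

```python
import numpy as np

def scaled_attention(Q, K, V, use_esf=True):
    """Single-head attention with an optional size-dependent scaling factor.

    Illustrative ESF variant: multiply the usual 1/sqrt(d) logit scale by
    sqrt(log n), counteracting the flattening of the softmax as the number
    of nodes n increases (assumed form, not the paper's exact factor).
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    if use_esf and n > 1:
        scale *= np.sqrt(np.log(n))  # size-dependent entropy correction
    logits = (Q @ K.T) * scale
    # numerically stable row-wise softmax
    logits -= logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

Because the factor only rescales the logits, it adds no learnable parameters and essentially no compute, which is consistent with the paper's claim of negligible overhead.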

📝 Abstract
Neural models produce promising results when solving Vehicle Routing Problems (VRPs), but often fall short in generalization. Recent attempts to enhance model generalization often incur unnecessarily large training cost or cannot be directly applied to other models solving different VRP variants. To address these issues, we take a novel perspective on model architecture in this study. Specifically, we propose a plug-and-play Entropy-based Scaling Factor (ESF) and a Distribution-Specific (DS) decoder to enhance the size and distribution generalization, respectively. ESF adjusts the attention weight pattern of the model towards familiar ones discovered during training when solving VRPs of varying sizes. The DS decoder explicitly models VRPs of multiple training distribution patterns through multiple auxiliary light decoders, expanding the model representation space to encompass a broader range of distributional scenarios. We conduct extensive experiments on both synthetic and widely recognized real-world benchmarking datasets and compare the performance with seven baseline models. The results demonstrate the effectiveness of using ESF and DS decoder to obtain a more generalizable model and showcase their applicability to solve different VRP variants, i.e., travelling salesman problem and capacitated VRP. Notably, our proposed generic components require minimal computational resources, and can be effortlessly integrated into conventional generalization strategies to further elevate model generalization.
Problem

Research questions and friction points this paper is trying to address.

Enhance generalization of neural VRP solvers
Reduce training costs for VRP variants
Improve model adaptability to different VRP sizes and distributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposed plug-and-play Entropy-based Scaling Factor (ESF)
Introduced Distribution-Specific (DS) decoder for generalization
Enhanced model generalization with minimal computational resources
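At inference time, the DS decoder can be viewed as an ensemble of light decoding heads, one per training distribution, with the cheapest tour kept. The sketch below uses two toy heuristics (input-order and greedy nearest-neighbour) as hypothetical stand-ins for the learned decoder heads.

```python
import numpy as np

def tour_length(coords, tour):
    """Total Euclidean length of the closed tour visiting coords[tour]."""
    pts = coords[np.asarray(tour)]
    return float(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum())

def identity_decoder(coords):
    """Toy stand-in for a learned head: visit nodes in input order."""
    return list(range(len(coords)))

def nearest_neighbor_decoder(coords):
    """Toy stand-in for a learned head: greedy nearest-neighbour tour."""
    tour, unvisited = [0], set(range(1, len(coords)))
    while unvisited:
        last = coords[tour[-1]]
        nxt = min(unvisited, key=lambda j: float(np.linalg.norm(coords[j] - last)))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def ds_decode(coords, decoders):
    """Run every distribution-specific decoder head; keep the cheapest tour."""
    best_tour, best_cost = None, float("inf")
    for decode in decoders:
        tour = decode(coords)
        cost = tour_length(coords, tour)
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost
```

Since each auxiliary head is light, the ensemble's cost grows slowly with the number of modelled distributions, matching the paper's minimal-resource claim.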
Yubin Xiao
Jilin University
Di Wang
Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, 639956, Singapore; WeBank-NTU Joint Research Institute on Fintech, Nanyang Technological University, 639956, Singapore
Xuan Wu
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Yuesong Wu
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Boyang Li
School of Computer Science and Engineering, Nanyang Technological University, 639956, Singapore
Wei Du
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Liupu Wang
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China
You Zhou
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, 130012, China