Towards Generalization-Oriented Models for Vehicle Routing Problems with Mixture-of-Experts

📅 2026-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited out-of-distribution generalization of deep reinforcement learning (DRL) approaches to vehicle routing problems, which are typically trained on instances generated from a single (uniform) distribution. To overcome this limitation, the authors propose a modular policy architecture, R2E-IG, with three components: a Residual Refined Expert (R2E) module that enhances expert expressiveness through residual refinement, instance-level gating (IG) that learns distribution-aware instance representations and routes each input to suitable modules, and a Dynamic Weight Adaption (DWA) strategy that reweights training data from different distributions during mixed-distribution training. The method achieves competitive performance against state-of-the-art baselines on both in-distribution and out-of-distribution instances across synthetic and benchmark datasets. It can also be integrated into existing DRL-based methods to further improve their performance.
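To make the idea of instance-level gating over residual experts concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the mean-pooled instance embedding, the expert form `x + tanh(W x)`, the dense softmax gate, and every name below (`instance_embedding`, `residual_expert`, `moe_forward`) are assumptions made for illustration, since the summary does not specify these details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def instance_embedding(nodes):
    """Assumed pooling: average node features into one vector per instance."""
    dim = len(nodes[0])
    return [sum(n[d] for n in nodes) / len(nodes) for d in range(dim)]


def residual_expert(W, x):
    """Hypothetical expert: the input plus a learned refinement tanh(W x)."""
    return [xi + math.tanh(h) for xi, h in zip(x, matvec(W, x))]


def moe_forward(nodes, gate_W, expert_Ws):
    """Compute one gate vector per instance and apply it to every node.

    gate_W has one row per expert; each entry of expert_Ws is a dim x dim matrix.
    """
    gate = softmax(matvec(gate_W, instance_embedding(nodes)))
    outputs = []
    for x in nodes:
        expert_outs = [residual_expert(W, x) for W in expert_Ws]
        mixed = [
            sum(g * out[d] for g, out in zip(gate, expert_outs))
            for d in range(len(x))
        ]
        outputs.append(mixed)
    return gate, outputs
```

Because the gate is computed once from the pooled instance representation, all nodes of an instance share the same expert mixture, which is what distinguishes instance-level gating from per-node (token-level) gating. The residual form also means that an expert with zero weights leaves its input unchanged.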
📝 Abstract
In recent years, Deep Reinforcement Learning (DRL) has achieved substantial progress on Vehicle Routing Problems (VRPs). However, existing DRL-based methods are typically trained on instances generated from a uniform distribution, which limits their performance under real-world distribution shifts. In this paper, we aim to develop a generalization-oriented model that partitions the policy network into multiple modules and adaptively recombines these modules to form instance-specific policies during inference. Specifically, we propose Residual Refined Experts with Instance-level Gating (R2E-IG) to improve cross-distribution generalization. Our contributions are threefold: (1) We introduce a Residual Refined Expert (R2E) architecture that enhances expert expressiveness via residual refinement; (2) We design an instance-level gating mechanism that learns distribution-aware instance representations and routes inputs to suitable modules; (3) We propose a mixed-distribution training mechanism equipped with Dynamic Weight Adaption (DWA), which dynamically reweights training data from different distributions to emphasize more informative ones. Extensive experiments show that R2E-IG achieves competitive performance against state-of-the-art baselines on both in-distribution and out-of-distribution instances across synthetic and benchmark datasets. Moreover, R2E-IG is generic and can be easily integrated into existing DRL-based methods to further improve performance.
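The reweighting idea in contribution (3) can be illustrated with a small sketch. The abstract does not say how "more informative" is measured, so the code below assumes a proxy: the current optimality gap on each training distribution, turned into sampling weights with a temperature-scaled softmax. The function names, the `temperature` parameter, and the distribution labels in the usage note are all hypothetical.

```python
import math
import random


def dwa_weights(gaps, temperature=1.0):
    """Map per-distribution gaps to sampling weights (assumed rule).

    A larger gap yields a larger weight; the weights sum to 1.
    """
    m = max(gaps.values())
    exps = {name: math.exp((g - m) / temperature) for name, g in gaps.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def sample_distributions(weights, batch_size, rng):
    """Draw the source distribution for each instance in a training batch."""
    names = list(weights)
    probs = [weights[n] for n in names]
    return rng.choices(names, weights=probs, k=batch_size)
```

Under these assumptions, a training loop would re-estimate the gaps periodically, for example `dwa_weights({"uniform": 0.02, "cluster": 0.08, "mixed": 0.05})`, and then call `sample_distributions(weights, 64, random.Random(0))` to decide which distribution each instance of the next batch is generated from. A lower `temperature` concentrates sampling on the hardest distribution; a higher one approaches uniform mixing.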
Problem

Research questions and friction points this paper is trying to address.

Vehicle Routing Problems
Distribution Shift
Generalization
Deep Reinforcement Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts
Generalization
Vehicle Routing Problem
Instance-level Gating
Dynamic Weight Adaptation
Changhao Miao
Beijing Institute of Technology
Machine Learning, Optimization
Yuntian Zhang
State Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing Institute of Technology, Beijing 100081, China; Department of Computer, Control and Management Engineering “Antonio Ruberti”, Sapienza University of Rome, Rome, 00185, Italy
Tongyu Wu
State Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing Institute of Technology, Beijing 100081, China
Fang Deng
Beijing Institute of Technology
New Energy, Intelligent Information Processing, Intelligent Wearable System
Chen Chen
State Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing Institute of Technology, Beijing 100081, China; School of AI, Beijing Institute of Technology, Beijing 100081, China