Towards Generalization-Oriented Models for Vehicle Routing Problems with Mixture-of-Experts

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited out-of-distribution generalization of deep reinforcement learning (DRL) approaches to vehicle routing problems, which often stems from training on instances drawn from a single (typically uniform) distribution. The authors propose a modular policy architecture with three components: a Residual Refined Expert (R2E) module that enhances expert expressiveness, instance-level gating (IG) that routes each instance to suitable modules based on a distribution-aware representation, and a Dynamic Weight Adaption (DWA) strategy that reweights training data across mixed distributions. The method achieves competitive performance against state-of-the-art baselines on both in-distribution and out-of-distribution instances across synthetic and benchmark datasets. It is also generic and can be integrated into existing DRL-based methods to further improve their performance.
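To make the reweighting idea concrete, the following is a minimal sketch of one way training data from several distributions could be reweighted dynamically. It is not the paper's DWA rule, which this summary does not specify. The sketch assumes that a higher recent average loss marks a distribution as "more informative" and applies a softmax with a temperature; the function names `adapt_weights` and `allocate_batch`, the distribution names, and the loss values are all hypothetical.

```python
import math

def adapt_weights(avg_losses, temperature=1.0):
    """Map per-distribution average losses to sampling weights.

    Assumption (not from the paper): a higher recent loss means the
    distribution is more informative, so it receives a larger weight.
    """
    names = list(avg_losses)
    peak = max(avg_losses[n] for n in names)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {n: math.exp((avg_losses[n] - peak) / temperature) for n in names}
    total = sum(exps.values())
    return {n: exps[n] / total for n in names}

def allocate_batch(weights, batch_size):
    """Split a batch across distributions using largest-remainder rounding."""
    raw = {n: w * batch_size for n, w in weights.items()}
    counts = {n: int(r) for n, r in raw.items()}
    leftover = batch_size - sum(counts.values())
    by_remainder = sorted(raw, key=lambda n: raw[n] - counts[n], reverse=True)
    for n in by_remainder[:leftover]:
        counts[n] += 1
    return counts

# Hypothetical running losses for three instance distributions.
losses = {"uniform": 0.02, "cluster": 0.08, "mixed": 0.05}
weights = adapt_weights(losses, temperature=0.05)
counts = allocate_batch(weights, batch_size=64)
```

A lower temperature concentrates the batch on the hardest distribution, while a higher one approaches uniform mixing; recomputing the weights every few epochs would make the schedule dynamic.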

📝 Abstract

In recent years, Deep Reinforcement Learning (DRL) has achieved substantial progress on Vehicle Routing Problems (VRPs). However, existing DRL-based methods are typically trained on instances generated from a uniform distribution, which limits their performance under real-world distribution shifts. In this paper, we aim to develop a generalization-oriented model that partitions the policy network into multiple modules and adaptively recombines them to form specific policies during inference. Specifically, we propose Residual Refined Experts with Instance-level Gating (R2E-IG) to improve cross-distribution generalization. Our contributions are threefold: (1) We introduce a Residual Refined Expert (R2E) architecture that enhances expert expressiveness via residual refinement; (2) We design an instance-level gating mechanism that learns distribution-aware instance representations and routes inputs to suitable modules; (3) We propose a mixed-distribution training mechanism equipped with Dynamic Weight Adaption (DWA), which dynamically reweights training data from different distributions to emphasize more informative ones. Extensive experiments show that R2E-IG achieves competitive performance against state-of-the-art baselines on both in-distribution and out-of-distribution instances across synthetic and benchmark datasets. Moreover, R2E-IG is generic and can be easily integrated into existing DRL-based methods to further improve performance.
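The first two contributions can be illustrated with a rough sketch of instance-level gating over residual experts. This is an illustration under stated assumptions, not the authors' architecture: the instance representation is assumed to be a mean pool of node features, each expert is assumed to be a linear map plus a small residual correction, and the gate keeps the top-k experts. `ResidualExpert`, `instance_embedding`, `moe_layer`, and all dimensions are hypothetical names and values.

```python
import math
import random

def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

class ResidualExpert:
    """Hypothetical expert: base linear map plus a residual refinement."""

    def __init__(self, dim, rng):
        self.base = rand_matrix(dim, dim, rng)
        self.refine = rand_matrix(dim, dim, rng)

    def __call__(self, h):
        y = matvec(self.base, h)
        # Refinement branch: ReLU followed by a second linear map,
        # added back to the base output as a residual.
        r = matvec(self.refine, [max(0.0, v) for v in y])
        return [a + b for a, b in zip(y, r)]

def instance_embedding(nodes):
    # Assumption: mean-pooled node features summarize the instance.
    n = len(nodes)
    return [sum(node[j] for node in nodes) / n for j in range(len(nodes[0]))]

def moe_layer(nodes, experts, gate_W, top_k=2):
    # One gating decision per instance, shared by all of its nodes.
    gate = softmax(matvec(gate_W, instance_embedding(nodes)))
    top = sorted(range(len(gate)), key=lambda i: gate[i], reverse=True)[:top_k]
    norm = sum(gate[i] for i in top)
    weights = {i: gate[i] / norm for i in top}
    out = []
    for h in nodes:
        mixed = [0.0] * len(h)
        for i, w in weights.items():
            mixed = [m + w * v for m, v in zip(mixed, experts[i](h))]
        out.append(mixed)
    return out, weights

rng = random.Random(0)
dim, n_experts = 4, 3
nodes = [[rng.random() for _ in range(dim)] for _ in range(5)]
experts = [ResidualExpert(dim, rng) for _ in range(n_experts)]
gate_W = rand_matrix(n_experts, dim, rng)
out, weights = moe_layer(nodes, experts, gate_W, top_k=2)
```

Gating once per instance, rather than once per node, means every node in a routing instance is processed by the same expert mixture, which is the property the abstract's "instance-level" wording suggests.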

Problem

Research questions and friction points this paper is trying to address.

Vehicle Routing Problems

Distribution Shift

Generalization

Deep Reinforcement Learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts

Generalization

Vehicle Routing Problem

Instance-level Gating