🤖 AI Summary
This work addresses the poor scalability and low computational efficiency of existing neural approaches to vehicle routing problems (VRPs) on multigraphs, where parallel edges represent distinct travel options. The authors propose Node-Edge Policy Factorization (NEPF), which splits the routing policy into two stages: node permutation and edge selection. The approach combines a pre-encoding edge aggregation scheme, a non-autoregressive architecture for the edge stage, and a hierarchical reinforcement learning method that trains both stages jointly. Across six VRP variants, NEPF matches or outperforms state-of-the-art methods in solution quality while being significantly faster in training and inference.
📝 Abstract
Most neural methods for Vehicle Routing Problems (VRPs) are limited to Euclidean settings or simple graphs. In this work, we instead consider multigraphs, where parallel edges represent distinct travel options with varying trade-offs (e.g., distance vs. time). Few methods are designed for such formulations, and those that exist face major scalability issues. We mitigate these issues via a Node-Edge Policy Factorization (NEPF) approach, which splits the routing policy into a node permutation stage and an edge selection stage. To enable the decomposition, we introduce a pre-encoding edge aggregation scheme and a non-autoregressive architecture for the edge stage, as well as a hierarchical reinforcement learning method to train the stages jointly. Our experiments across six VRP variants demonstrate that NEPF matches or outperforms the state of the art in solution quality while being significantly faster in training and inference.
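To make the two-stage split concrete, the following is a minimal sketch of the edge selection stage only, not the authors' model. It assumes the node permutation is already given, and it replaces the learned non-autoregressive edge policy with a simple weighted-cost rule. The multigraph representation, the function names `select_edges` and `tour_cost`, the `weight` parameter, and the toy edge values are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each ordered node pair maps to a list of
# parallel edges, and each edge carries (distance, time) attributes.
Edge = Tuple[float, float]
MultiGraph = Dict[Tuple[int, int], List[Edge]]


def edge_cost(edge: Edge, weight: float) -> float:
    """Scalarize an edge as weight * distance + (1 - weight) * time."""
    distance, time = edge
    return weight * distance + (1.0 - weight) * time


def select_edges(graph: MultiGraph, tour: List[int], weight: float = 0.5) -> List[int]:
    """Stage 2 stand-in: pick one parallel edge per leg of a fixed tour.

    Every leg is decided independently of the other legs, which mirrors
    the non-autoregressive idea. The rule itself is an assumption, not
    the learned policy from the paper.
    """
    legs = zip(tour, tour[1:] + tour[:1])  # close the tour back to the start
    chosen = []
    for u, v in legs:
        options = graph[(u, v)]
        best = min(range(len(options)), key=lambda i: edge_cost(options[i], weight))
        chosen.append(best)
    return chosen


def tour_cost(graph: MultiGraph, tour: List[int], chosen: List[int], weight: float = 0.5) -> float:
    """Total scalarized cost of a tour under the chosen parallel edges."""
    legs = zip(tour, tour[1:] + tour[:1])
    return sum(edge_cost(graph[leg][k], weight) for leg, k in zip(legs, chosen))


# Toy instance: three nodes, two parallel options on the first two legs.
graph: MultiGraph = {
    (0, 1): [(4.0, 10.0), (6.0, 5.0)],
    (1, 2): [(3.0, 3.0), (2.0, 8.0)],
    (2, 0): [(5.0, 5.0)],
}
tour = [0, 1, 2]  # stage 1 output (node permutation), assumed given here

balanced = select_edges(graph, tour, weight=0.5)
shortest = select_edges(graph, tour, weight=1.0)
print(balanced, tour_cost(graph, tour, balanced, weight=0.5))  # [1, 0, 0] 13.5
print(shortest, tour_cost(graph, tour, shortest, weight=1.0))  # [0, 1, 0] 11.0
```

The sketch shows why the split can help with scale: once the node order is fixed, the edge choice for each leg reduces to a small local decision over that leg's parallel edges, so the search does not have to range over every node-and-edge combination at once.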