🤖 AI Summary
Existing deep learning solvers for the Vehicle Routing Problem (VRP) generalize poorly across problem scales and struggle on instance sizes not seen during training.
Method: We propose a continual learning framework tailored for multi-scale VRP. It trains models sequentially over increasingly large problem scales and incorporates dual-level regularization—inter-task and intra-task—to jointly constrain parameter updates. Additionally, it integrates behavior cloning and experience replay to mitigate catastrophic forgetting and consolidate optimization strategies at each scale.
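The dual-level objective described above can be sketched as a task loss augmented with two penalty terms. The quadratic penalty forms, the function names, and the weighting coefficients below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def inter_task_reg(params, prev_params):
    # Illustrative inter-task regularizer (assumed form): quadratic penalty
    # keeping current parameters close to those learned on the previous,
    # smaller problem scale.
    return float(np.sum((params - prev_params) ** 2))

def intra_task_reg(policy_logits, snapshot_logits):
    # Illustrative intra-task regularizer (assumed form): squared distance
    # to a recent snapshot of the model's own outputs, in the spirit of
    # behavior cloning on the latest desirable behaviors.
    return float(np.mean((policy_logits - snapshot_logits) ** 2))

def total_loss(task_loss, params, prev_params,
               policy_logits, snapshot_logits,
               lam_inter=0.1, lam_intra=0.1):
    # Combined objective: task loss plus dual-level regularization.
    # lam_inter and lam_intra are hypothetical weights.
    return (task_loss
            + lam_inter * inter_task_reg(params, prev_params)
            + lam_intra * intra_task_reg(policy_logits, snapshot_logits))
```

For example, with parameters that drifted by one unit in two dimensions and unchanged policy outputs, the regularized loss exceeds the raw task loss by `lam_inter * 2`.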
Contribution/Results: The framework significantly enhances cross-scale generalization to previously unseen problem sizes. Experiments on diverse VRP benchmarks—including both seen and unseen scales—demonstrate consistent superiority over state-of-the-art methods. Notably, it achieves outstanding performance in zero-shot transfer scenarios, where no fine-tuning is performed on target-scale instances. Our approach thus advances scalable, generalizable deep VRP solving without requiring retraining or architectural modification for new scales.
📝 Abstract
Exploring machine learning techniques for addressing vehicle routing problems has attracted considerable research attention. To achieve decent and efficient solutions, existing deep models for vehicle routing problems are typically trained and evaluated using instances of a single size. This substantially limits their ability to generalize across different problem sizes and thus hampers their practical applicability. To address this issue, we propose a continual-learning-based framework that sequentially trains a deep model with instances of ascending problem sizes. Specifically, on the one hand, we design an inter-task regularization scheme to retain the knowledge acquired from smaller problem sizes when training the model on a larger size. On the other hand, we introduce an intra-task regularization scheme to consolidate the model by imitating its latest desirable behaviors during training on each size. Additionally, we exploit experience replay to revisit instances of formerly trained sizes, mitigating catastrophic forgetting. Experimental results show that our approach achieves predominantly superior performance across various problem sizes (either seen or unseen in training), as compared to state-of-the-art deep models, including those specialized for generalizability enhancement. Meanwhile, the ablation studies on the key designs demonstrate their synergistic effect in the proposed framework.
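The experience-replay component described in the abstract can be sketched as a buffer that stores instances from previously trained scales and mixes a few of them into each batch at the current scale. The class name, capacity, eviction policy, and replay fraction below are assumptions for illustration, not details from the paper:

```python
import random

class ReplayBuffer:
    # Illustrative replay buffer (assumed design): holds instances from
    # formerly trained problem sizes so they can be revisited later.
    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self.storage = []
        self.rng = random.Random(seed)

    def add(self, instance):
        # Evict the oldest stored instance once capacity is reached.
        if len(self.storage) >= self.capacity:
            self.storage.pop(0)
        self.storage.append(instance)

    def sample(self, k):
        # Draw up to k stored instances uniformly at random.
        k = min(k, len(self.storage))
        return self.rng.sample(self.storage, k)

def make_batch(current_instances, buffer, replay_fraction=0.25):
    # Mix current-scale instances with replayed smaller-scale ones;
    # replay_fraction is a hypothetical mixing ratio.
    n_replay = int(len(current_instances) * replay_fraction)
    return current_instances + buffer.sample(n_replay)
```

Revisiting a small, bounded set of old-scale instances at each step is what counters catastrophic forgetting without re-running full training on earlier sizes.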