Lifelong Learning with Behavior Consolidation for Vehicle Routing

📅 2025-09-25
🤖 AI Summary
To address catastrophic forgetting and weak zero-shot generalization in neural Vehicle Routing Problem (VRP) solvers, this paper proposes a lifelong learning framework, LLR-BC, for tasks that differ in problem distribution and scale. Its core is a behavior consolidation mechanism: high-quality solution behaviors from earlier tasks are stored in a buffer, and while the solver trains on a new task its behaviors are aligned with the buffered ones in a decision-seeking way to consolidate prior knowledge. A confidence-aware weighting strategy assigns larger consolidation weights to low-confidence decisions, which helps balance stability and plasticity. The framework follows a sequential task-learning paradigm and is compatible with mainstream neural solver architectures. In experiments on capacitated VRP (CVRP) and traveling salesman problem (TSP) tasks, the approach mitigates forgetting, keeps performance on previously learned tasks stable, and improves zero-shot generalization.
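The buffer of past behaviors mentioned in the summary could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Behavior` and `BehaviorBuffer`, the fields they hold, and the use of reservoir sampling as the retention policy are all hypothetical choices made here, not details confirmed by the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Behavior:
    """One buffered decision (hypothetical layout, not the paper's)."""
    task_id: int   # which task the decision came from
    state: tuple   # assumed encoding of the partial route at decision time
    probs: list    # solver's probability over candidate next nodes


@dataclass
class BehaviorBuffer:
    """Fixed-size buffer; reservoir sampling is an assumed retention policy."""
    capacity: int
    seed: int = 0
    items: list = field(default_factory=list)
    seen: int = 0

    def __post_init__(self):
        # Private RNG so the buffer is reproducible for a given seed.
        self._rng = random.Random(self.seed)

    def add(self, behavior):
        """Keep each behavior seen so far with equal probability."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(behavior)
        else:
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = behavior

    def sample(self, k):
        """Draw up to k buffered behaviors without replacement."""
        return self._rng.sample(self.items, min(k, len(self.items)))


# Usage: buffer decisions from two tasks, then draw a replay batch.
buffer = BehaviorBuffer(capacity=5)
for step in range(100):
    buffer.add(Behavior(task_id=step // 50, state=(step,), probs=[0.7, 0.3]))
batch = buffer.sample(3)
```

Reservoir sampling is used here only because it keeps memory bounded without knowing in advance how many decisions will arrive; a quality-based selection rule, as the summary's "high-quality" wording suggests, would replace the `add` method.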

📝 Abstract
Recent neural solvers have demonstrated promising performance in learning to solve routing problems. However, existing studies are primarily based on one-off training on one or a set of predefined problem distributions and scales, i.e., tasks. When a new task arises, they typically rely on either zero-shot generalization, which may be poor due to the discrepancies between the new task and the training task(s), or fine-tuning the pretrained solver on the new task, which possibly leads to catastrophic forgetting of knowledge acquired from previous tasks. This paper explores a novel lifelong learning paradigm for neural VRP solvers, where multiple tasks with diverse distributions and scales arise sequentially over time. Solvers are required to effectively and efficiently learn to solve new tasks while maintaining their performance on previously learned tasks. Consequently, a novel framework called Lifelong Learning Router with Behavior Consolidation (LLR-BC) is proposed. LLR-BC consolidates prior knowledge effectively by aligning behaviors of the solver trained on a new task with the buffered ones in a decision-seeking way. To encourage more focus on crucial experiences, LLR-BC assigns greater consolidated weights to decisions with lower confidence. Extensive experiments on capacitated vehicle routing problems and traveling salesman problems demonstrate LLR-BC's effectiveness in training high-performance neural solvers in a lifelong learning setting, addressing the catastrophic forgetting issue, maintaining their plasticity, and improving zero-shot generalization ability.
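To make the confidence-weighted alignment concrete, the sketch below computes a consolidation loss between buffered and current decision distributions. It rests on several assumptions that the abstract does not spell out: confidence is taken to be the largest probability in the buffered distribution, the weight is proportional to one minus that confidence, and "decision-seeking" is read as a mode-seeking (reverse) KL divergence. The function names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def confidence_weights(buffered):
    """Assumed rule: lower confidence (smaller max probability) gives a
    larger weight; weights are scaled so that they average to 1."""
    raw = [1.0 - max(p) for p in buffered]
    total = sum(raw)
    if total == 0:
        # Every buffered decision is fully confident: fall back to uniform.
        return [1.0] * len(buffered)
    n = len(buffered)
    return [r * n / total for r in raw]


def consolidation_loss(buffered, current):
    """Weighted mean of KL(current || buffered) over buffered decisions.
    The KL direction is this sketch's assumption, not a confirmed detail."""
    weights = confidence_weights(buffered)
    terms = [
        w * kl_divergence(cur, old)
        for w, old, cur in zip(weights, buffered, current)
    ]
    return sum(terms) / len(terms)


# Usage: the second buffered decision is less confident, so it weighs more.
buffered = [[0.9, 0.1], [0.5, 0.5]]
current = [[0.8, 0.2], [0.7, 0.3]]
weights = confidence_weights(buffered)
loss = consolidation_loss(buffered, current)
```

In a training loop this term would be added to the new task's learning objective, so that the solver keeps adapting while staying close to its buffered decisions where those decisions were least certain.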
Problem

Research questions and friction points this paper is trying to address.

Preventing catastrophic forgetting in lifelong neural routing solvers
Maintaining solver performance across sequential diverse tasks
Improving zero-shot generalization for vehicle routing problems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lifelong learning paradigm for neural VRP solvers
Behavior consolidation aligns solver decisions across tasks
Confidence-aware weighting gives low-confidence decisions greater consolidation weight

Jiyuan Pei
Victoria University of Wellington
Adaptive Operator Selection, Evolutionary Computation, Vehicle Routing
Yi Mei
Data Science and Artificial Intelligence & School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
Jialin Liu
School of Data Science, Lingnan University, Hong Kong SAR, China
Mengjie Zhang
Data Science and Artificial Intelligence & School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
Xin Yao
School of Data Science, Lingnan University, Hong Kong SAR, China