Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning

📅 2025-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address low forward transfer efficiency and severe catastrophic forgetting in lifelong robot learning, this paper proposes a task-agnostic dynamic mixture-of-experts (MoE) framework. Methodologically, it integrates three key components: (1) a progressive low-rank expert library coupled with a lightweight router for parameter-efficient policy generation; (2) coefficient replay—a novel rehearsal technique that mitigates forgetting via precise retrieval of frozen expert weights, outperforming demonstration-based replay; and (3) a unified design combining low-rank decomposition, dynamic gating-based routing, and modular parameter-efficient fine-tuning (PEFT). Evaluated on the LIBERO benchmark, the approach achieves state-of-the-art success rates using fewer than 0.5M trainable parameters and minimal storage overhead. Crucially, it is the first method to simultaneously enable strong forward transfer and robust forgetting resistance—without requiring task identifiers.
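The core mechanism described above (a frozen pretrained layer steered by a library of low-rank experts, mixed by a lightweight router) can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's implementation: the class name `LoRAExpertMoE`, the softmax gating, and all dimensions are assumptions for the sake of the example.

```python
import torch
import torch.nn as nn

class LoRAExpertMoE(nn.Module):
    """Hypothetical sketch: a frozen base linear layer whose output is
    adjusted by a dynamically weighted library of low-rank (LoRA-style)
    experts, selected by a lightweight input-conditioned router."""

    def __init__(self, d_in, d_out, rank=4, n_experts=3):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        for p in self.base.parameters():
            p.requires_grad = False  # frozen pretrained weights
        # Low-rank expert library: expert i is the pair (A_i, B_i),
        # contributing the update B_i @ A_i @ x.
        self.A = nn.Parameter(torch.randn(n_experts, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, d_out, rank))
        # Lightweight router: produces per-expert mixing coefficients.
        self.router = nn.Linear(d_in, n_experts)

    def forward(self, x):
        coeffs = torch.softmax(self.router(x), dim=-1)          # (batch, experts)
        low_rank = torch.einsum('erd,bd->ber', self.A, x)       # (batch, experts, rank)
        updates = torch.einsum('eor,ber->beo', self.B, low_rank)  # (batch, experts, d_out)
        delta = (coeffs.unsqueeze(-1) * updates).sum(dim=1)     # mix experts
        return self.base(x) + delta
```

Because each `B_i` starts at zero (a standard LoRA initialization), the module initially reproduces the frozen base layer exactly; training only the small `A`, `B`, and router parameters keeps the trainable footprint minimal.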

📝 Abstract
A generalist agent must continuously learn and adapt throughout its lifetime, achieving efficient forward transfer while minimizing catastrophic forgetting. Previous work within the dominant pretrain-then-finetune paradigm has explored parameter-efficient fine-tuning for single-task adaptation, effectively steering a frozen pretrained model with a small number of parameters. However, in the context of lifelong learning, these methods rely on the impractical assumption of a test-time task identifier and restrict knowledge sharing among isolated adapters. To address these limitations, we propose Dynamic Mixture of Progressive Parameter-Efficient Expert Library (DMPEL) for lifelong robot learning. DMPEL progressively learns a low-rank expert library and employs a lightweight router to dynamically combine experts into an end-to-end policy, facilitating flexible behavior during lifelong adaptation. Moreover, by leveraging the modular structure of the fine-tuned parameters, we introduce coefficient replay to guide the router in accurately retrieving frozen experts for previously encountered tasks, thereby mitigating catastrophic forgetting. This method is significantly more storage- and computation-efficient than applying demonstration replay to the entire policy. Extensive experiments on the lifelong manipulation benchmark LIBERO demonstrate that our framework outperforms state-of-the-art lifelong learning methods in success rates throughout continual adaptation, while utilizing minimal trainable parameters and storage.
Problem

Research questions and friction points this paper is trying to address.

Enable lifelong robot learning without task identifiers
Facilitate knowledge sharing among adaptive modules
Prevent catastrophic forgetting with minimal storage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic mixture of progressive parameter-efficient expert library
Lightweight router combines experts into policy
Coefficient replay mitigates catastrophic forgetting efficiently
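The coefficient-replay idea in the bullets above can be sketched as follows. This is a hedged illustration under assumptions, not the paper's code: the buffer stores (observation, router-coefficient) pairs and regularizes the router to reproduce past mixing weights, rather than replaying demonstrations through the entire policy; the class name `CoefficientReplayBuffer` and the MSE objective are my own illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoefficientReplayBuffer:
    """Hypothetical sketch of coefficient replay: keep only the router's
    small expert-mixing coefficient vectors for past tasks, and penalize
    the router if it drifts away from them on the stored inputs."""

    def __init__(self):
        self.buffer = []  # list of (observation, coefficient) pairs

    def store(self, obs, coeffs):
        # Tiny coefficient vectors (one entry per expert) instead of
        # full demonstrations replayed through the whole policy.
        self.buffer.append((obs.detach(), coeffs.detach()))

    def replay_loss(self, router):
        # Auxiliary loss added during new-task training so the router
        # still retrieves the frozen experts used for old tasks.
        if not self.buffer:
            return torch.tensor(0.0)
        losses = [F.mse_loss(torch.softmax(router(o), dim=-1), c)
                  for o, c in self.buffer]
        return torch.stack(losses).mean()
```

Since the expert parameters for old tasks stay frozen, preserving only the routing coefficients is enough to recover old behavior, which is why the storage cost stays far below full demonstration replay.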