Knowledge Retention for Continual Model-Based Reinforcement Learning

📅 2025-03-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses continual reinforcement learning scenarios where the reward function changes across tasks while the state space and environment dynamics remain fixed. It proposes a data-free method for incrementally constructing a world model, built on two core components: (1) a synthetic experience rehearsal mechanism that uses generative models to replay high-information-value state transitions from past tasks; and (2) an exploration-based memory recovery mechanism that combines intrinsic-reward-driven exploration with minimization of model prediction error to consolidate dynamics knowledge online. Together, these components mitigate catastrophic forgetting, enabling continual refinement and generalization of the world model. On multi-task continual learning benchmarks, the method significantly improves cross-task transfer and the long-horizon generalization of world models, outperforming existing baselines.
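As an illustrative sketch of the rehearsal idea (the `TransitionGenerator` class and its Gaussian generative model are placeholders, not the paper's architecture): a generative model is fit to past-task transitions, the raw data is then discarded, and synthetic samples are later mixed into new-task training batches so the world model keeps seeing old dynamics without any stored data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a learned generative model of past-task
# transitions: here simply a diagonal Gaussian fit to flattened
# (state, action, next_state) vectors.
class TransitionGenerator:
    def fit(self, transitions):
        self.mean = transitions.mean(axis=0)
        self.std = transitions.std(axis=0) + 1e-6

    def sample(self, n):
        return rng.normal(self.mean, self.std, size=(n, self.mean.size))

# Fit the generator, then discard the raw past-task data (data-free).
past_transitions = rng.normal(0.0, 1.0, size=(500, 6))
gen = TransitionGenerator()
gen.fit(past_transitions)
del past_transitions

# During a new task, each world-model training batch mixes fresh
# transitions with synthetic replay of old-task dynamics.
new_batch = rng.normal(2.0, 1.0, size=(32, 6))
replay = gen.sample(32)
mixed = np.concatenate([new_batch, replay])  # world model trains on this
print(mixed.shape)  # (64, 6)
```

In practice the generator would be a deep generative model (e.g., a VAE) rather than a Gaussian, but the training loop's structure, mixing real and synthetic transitions, is the same.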

📝 Abstract
We propose DRAGO, a novel approach for continual model-based reinforcement learning aimed at improving the incremental development of world models across a sequence of tasks that differ in their reward functions but not the state space or dynamics. DRAGO comprises two key components: Synthetic Experience Rehearsal, which leverages generative models to create synthetic experiences from past tasks, allowing the agent to reinforce previously learned dynamics without storing data, and Regaining Memories Through Exploration, which introduces an intrinsic reward mechanism to guide the agent toward revisiting relevant states from prior tasks. Together, these components enable the agent to maintain a comprehensive and continually developing world model, facilitating more effective learning and adaptation across diverse environments. Empirical evaluations demonstrate that DRAGO is able to preserve knowledge across tasks, achieving superior performance in various continual learning scenarios.
Problem

Research questions and friction points this paper is trying to address.

How to develop a world model incrementally across a sequence of tasks
How to retain knowledge across tasks whose reward functions differ
How to mitigate catastrophic forgetting without storing past-task data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic Experience Rehearsal: generative models recreate experiences from past tasks without storing data
Regaining Memories Through Exploration: intrinsic rewards guide the agent back to states relevant to prior tasks
Continual, cumulative world model development across tasks
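The intrinsic-reward component can be sketched as follows. This is an illustrative toy, not the paper's exact formulation: the reward here favors states that a frozen copy of the previous world model still predicts well but the current model predicts poorly, i.e., dynamics knowledge that is being forgotten and is worth revisiting. The linear "world models" and the assumed ground-truth dynamics are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

def prediction_error(model_weights, state):
    # Toy linear world model: predicts the next state as W @ s.
    pred = model_weights @ state
    true_next = 0.9 * state  # assumed ground-truth dynamics (placeholder)
    return float(np.sum((pred - true_next) ** 2))

def intrinsic_reward(old_w, cur_w, state):
    # High when the current model has forgotten what the old model knew,
    # steering exploration back toward those states.
    return prediction_error(cur_w, state) - prediction_error(old_w, state)

old_w = 0.9 * np.eye(3)  # frozen model: accurate on old-task states
cur_w = 0.5 * np.eye(3)  # current model: drifted after new-task training

s = rng.normal(size=3)
r = intrinsic_reward(old_w, cur_w, s)
print(r > 0)  # True: revisiting this forgotten state is rewarded
```

Adding this bonus to the task reward jointly drives exploration and keeps the current world model's training distribution anchored to previously learned dynamics.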