A Forget-and-Grow Strategy for Deep Reinforcement Learning Scaling in Continuous Control

📅 2025-07-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep reinforcement learning (DRL) for continuous control often suffers from low sample efficiency and poor generalization due to *primacy bias*: overfitting to the early experiences stored in the replay buffer. To address this, the paper proposes *Forget and Grow* (FoG), the first DRL algorithm to combine experience replay decay (ER Decay), inspired by neuroscientific accounts of memory forgetting, with dynamic network expansion, which adaptively grows model capacity during training. FoG gradually down-weights outdated experiences while incrementally expanding the network, without altering policy or value network designs, making it plug-and-play within standard DRL frameworks. Evaluation across 40+ continuous control tasks shows that FoG consistently outperforms state-of-the-art methods, including BRO, SimBa, and TD-MPC2, improving sample efficiency by up to 37% while enhancing training stability and cross-task generalization.

📝 Abstract
Deep reinforcement learning for continuous control has recently achieved impressive progress. However, existing methods often suffer from primacy bias, a tendency to overfit early experiences stored in the replay buffer, which limits an RL agent's sample efficiency and generalizability. In contrast, humans are less susceptible to such bias, partly due to infantile amnesia: the formation of new neurons disrupts early memory traces, leading to the forgetting of initial experiences. Inspired by these dual processes of forgetting and growing in neuroscience, this paper proposes Forget and Grow (FoG), a new deep RL algorithm that introduces two mechanisms. First, Experience Replay Decay (ER Decay), "forgetting early experience," balances memory by gradually reducing the influence of early experiences. Second, Network Expansion, "growing neural capacity," enhances an agent's ability to exploit patterns in existing data by dynamically adding new parameters during training. Empirical results on four major continuous control benchmarks with more than 40 tasks demonstrate the superior performance of FoG against existing state-of-the-art deep RL algorithms, including BRO, SimBa, and TD-MPC2.
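The ER Decay mechanism described in the abstract can be pictured as a biased sampling distribution over the replay buffer. The paper does not specify its exact schedule here, so the exponential weighting, the decay rate, and the function names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def er_decay_weights(buffer_size: int, decay: float = 0.999) -> np.ndarray:
    """Hypothetical ER-Decay weighting: older transitions (lower index)
    receive exponentially smaller sampling probability."""
    ages = np.arange(buffer_size)[::-1]   # index 0 is the oldest transition
    w = decay ** ages                     # older -> larger age -> smaller weight
    return w / w.sum()                    # normalize to a probability vector

def sample_batch(buffer: list, batch_size: int, decay: float = 0.999) -> list:
    """Sample a minibatch, gradually 'forgetting' early experiences."""
    p = er_decay_weights(len(buffer), decay)
    idx = rng.choice(len(buffer), size=batch_size, p=p)
    return [buffer[i] for i in idx]

buffer = list(range(10_000))              # stand-in for stored transitions
batch = sample_batch(buffer, batch_size=256)
```

Unlike simply evicting old transitions, a decayed distribution keeps early data available while shrinking its influence on gradient updates.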
Problem

Research questions and friction points this paper is trying to address.

Overcoming primacy bias in deep RL for continuous control
Enhancing sample efficiency and generalizability in RL agents
Balancing memory and neural capacity dynamically during training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Experience Replay Decay balances memory
Network Expansion grows neural capacity
Forget and Grow algorithm enhances performance
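The Network Expansion idea above, growing capacity by adding parameters during training, can be sketched as widening a hidden layer. The zero-initialization of new output weights (so growth is function-preserving) and all shapes below are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def expand_layer(W_in: np.ndarray, W_out: np.ndarray, extra_units: int):
    """Hypothetical expansion step: widen a hidden layer by `extra_units`
    neurons without changing the function the network currently computes.

    W_in:  (hidden, obs_dim) weights into the hidden layer
    W_out: (act_dim, hidden) weights out of the hidden layer
    """
    _, obs_dim = W_in.shape
    act_dim, _ = W_out.shape
    # New input weights: small random init so fresh units can start learning.
    new_in = rng.normal(scale=0.01, size=(extra_units, obs_dim))
    # New output weights: zeros, so the added units contribute nothing yet.
    new_out = np.zeros((act_dim, extra_units))
    return np.vstack([W_in, new_in]), np.hstack([W_out, new_out])

W_in = rng.normal(size=(64, 17))          # e.g. 17-dim observation
W_out = rng.normal(size=(6, 64))          # e.g. 6-dim action
W_in2, W_out2 = expand_layer(W_in, W_out, extra_units=32)

x = rng.normal(size=17)
before = W_out @ np.tanh(W_in @ x)
after = W_out2 @ np.tanh(W_in2 @ x)
assert np.allclose(before, after)         # growth preserves current outputs
```

Because the new output columns are zero, the expanded network computes exactly the same policy at the moment of growth; the extra capacity is only exploited by subsequent gradient updates.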
Zilin Kang
Shanghai Qi Zhi Institute, Department of Computer Science and Technology, Tsinghua University
Chenyuan Hu
Institute for Interdisciplinary Information Sciences, Tsinghua University
Yu Luo
Huawei Noah’s Ark Lab
Zhecheng Yuan
IIIS, Tsinghua University
Visual Reinforcement Learning, Representation Learning, Robotics
Ruijie Zheng
University of Maryland, College Park, NVIDIA
Machine Learning, Reinforcement Learning
Huazhe Xu
Tsinghua University
Embodied AI, Reinforcement Learning, Computer Vision, Deep Learning