📝 Abstract
The loss of plasticity in learning agents, analogous to the solidification of neural pathways in biological brains, significantly impedes learning and adaptation in reinforcement learning due to its non-stationary nature. To address this fundamental challenge, we propose a novel approach, *Neuroplastic Expansion* (NE), inspired by cortical expansion in cognitive science. NE maintains learnability and adaptability throughout the entire training process by dynamically growing the network from a smaller initial size to its full dimension. Our method is designed with three key components: (1) elastic topology generation based on potential gradients, (2) dormant neuron pruning to optimize network expressivity, and (3) neuron consolidation via experience review to strike a balance in the plasticity–stability dilemma. Extensive experiments demonstrate that NE effectively mitigates plasticity loss and outperforms state-of-the-art methods across various tasks in MuJoCo and DeepMind Control Suite environments. NE enables more adaptive learning in complex, dynamic environments, which represents a crucial step towards transitioning deep reinforcement learning from static, one-time training paradigms to more flexible, continually adapting models.
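As a rough illustration of the prune-and-grow cycle sketched in the abstract: the snippet below scores each neuron's "dormancy" by its mean activation magnitude relative to the layer average, reinitializes dormant neurons, and appends new randomly initialized neurons. The specific dormancy score, threshold `tau`, and growth rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def dormancy_scores(activations):
    """Per-neuron dormancy score: mean |activation|, normalized by the
    layer-wide mean. A score near zero marks a neuron as dormant.
    (Illustrative criterion; NE's exact definition may differ.)"""
    mag = np.abs(activations).mean(axis=0)        # shape: (n_neurons,)
    return mag / (mag.mean() + 1e-8)

def prune_and_grow(weights, activations, tau=0.1, n_grow=2, rng=None):
    """Toy sketch of one prune/expand step: reinitialize neurons whose
    dormancy score falls below `tau`, then grow `n_grow` fresh neurons
    (new output columns of the weight matrix)."""
    rng = np.random.default_rng(rng)
    scores = dormancy_scores(activations)
    dormant = scores < tau
    w = weights.copy()
    # Reset dormant neurons' incoming weights so they can relearn.
    w[:, dormant] = rng.normal(0.0, 0.01, size=(w.shape[0], dormant.sum()))
    # Expand the layer with newly initialized neurons.
    new_cols = rng.normal(0.0, 0.01, size=(w.shape[0], n_grow))
    return np.concatenate([w, new_cols], axis=1), dormant
```

In a full agent this step would run periodically during training, with the experience-review consolidation applied afterward to protect previously learned behavior; that part is omitted here.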