AltNet: Addressing the Plasticity-Stability Dilemma in Reinforcement Learning

📅 2025-11-30
🤖 AI Summary
In reinforcement learning, neural networks gradually lose plasticity during training: their ability to keep learning from new experiences declines, which impairs continual learning. Existing parameter-reset methods can restore plasticity, but they induce temporary performance drops that hinder real-world deployment. To address this, the authors propose AltNet, a dual-network alternating framework: an active network learns online as it acts in the environment, while a passive network learns off-policy from the active network's interactions and a replay buffer and periodically takes over control, so resetting the outgoing network incurs no performance drop. This design decouples plasticity recovery from performance preservation, addressing the plasticity-stability dilemma without compromising operational robustness. Evaluated on multiple high-dimensional continuous-control tasks from the DeepMind Control Suite, AltNet achieves better sample efficiency and final performance than relevant baselines and state-of-the-art reset methods, while avoiding the performance drops that resets normally cause.

📝 Abstract
Neural networks have shown remarkable success in supervised learning when trained on a single task using a fixed dataset. However, when neural networks are trained on a reinforcement learning task, their ability to continue learning from new experiences declines over time. This decline in learning ability is known as plasticity loss. To restore plasticity, prior work has explored periodically resetting the parameters of the learning network, a strategy that often improves overall performance. However, such resets come at the cost of a temporary drop in performance, which can be dangerous in real-world settings. To overcome this instability, we introduce AltNet, a reset-based approach that restores plasticity without performance degradation by leveraging twin networks. The use of twin networks anchors performance during resets through a mechanism that allows networks to periodically alternate roles: one network learns as it acts in the environment, while the other learns off-policy from the active network's interactions and a replay buffer. At fixed intervals, the active network is reset and the passive network, having learned from prior experiences, becomes the new active network. AltNet restores plasticity, improving sample efficiency and achieving higher performance, while avoiding performance drops that pose risks in safety-critical settings. We demonstrate these advantages in several high-dimensional control tasks from the DeepMind Control Suite, where AltNet outperforms various relevant baseline methods, as well as state-of-the-art reset-based techniques.
Problem

Research questions and friction points this paper is trying to address.

Addresses plasticity loss in reinforcement learning neural networks
Avoids performance drops during network resets in RL
Enhances sample efficiency and safety in control tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

AltNet uses twin networks to alternate roles
Resets active network periodically without performance drop
Learns off-policy from replay buffer and interactions
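The alternation mechanism described above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: `Net`, the environment interaction, and the update calls are stand-ins, and the swap interval is a hypothetical hyperparameter.

```python
import random

class Net:
    """Stand-in for a policy network; only tracks how many updates it has seen."""
    def __init__(self, name):
        self.name = name
        self.updates = 0

    def reset(self):
        # Re-initializing parameters is what restores plasticity in reset methods.
        self.updates = 0

def altnet_loop(total_steps, swap_interval):
    """Sketch of AltNet's alternation: the active net acts and learns online,
    the passive net learns off-policy from the shared replay buffer; every
    swap_interval steps the active net is reset and the passive net takes over."""
    active, passive = Net("A"), Net("B")
    replay_buffer = []
    history = []  # which network was in control at each step
    for step in range(1, total_steps + 1):
        transition = (step, random.random())   # placeholder env interaction
        replay_buffer.append(transition)
        active.updates += 1                    # online update from its own interaction
        passive.updates += 1                   # off-policy update from the buffer
        if step % swap_interval == 0:
            active.reset()                     # reset happens off-duty, so no drop
            active, passive = passive, active  # passive becomes the new active net
        history.append(active.name)
    return history

roles = altnet_loop(total_steps=6, swap_interval=3)
print(roles)  # -> ['A', 'A', 'B', 'B', 'B', 'A']
```

Because the network being reset is never the one in control, the agent's deployed policy is always a trained network, which is the source of the claimed stability.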
Mansi Maheshwari
University of Massachusetts Amherst, United States
John C. Raisbeck
University of Massachusetts Amherst, United States
Bruno Castro da Silva
University of Massachusetts
artificial intelligence · machine learning · reinforcement learning