Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn

📅 2025-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the loss of plasticity, i.e., an agent's capacity to adapt to new tasks, environments, or distributions, in deep continual reinforcement learning (CRL). We study plasticity loss through the lens of *churn*: variability in network outputs on out-of-batch data induced by mini-batch training. We show that plasticity loss is accompanied by worsening churn, driven by a gradual decrease in the rank of the Neural Tangent Kernel (NTK) matrix, and that reducing churn helps prevent rank collapse while adaptively adjusting the step size of the regular RL gradients. Building on this insight, we introduce Continual Churn Approximated Reduction (C-CHAIN), which improves learning performance and outperforms baselines across a diverse range of continual learning environments built on OpenAI Gym Control, ProcGen, DeepMind Control Suite, and MinAtar.

📝 Abstract

Plasticity, or the ability of an agent to adapt to new tasks, environments, or distributions, is crucial for continual learning. In this paper, we study the loss of plasticity in deep continual RL from the lens of churn: network output variability for out-of-batch data induced by mini-batch training. We demonstrate that (1) the loss of plasticity is accompanied by the exacerbation of churn due to the gradual rank decrease of the Neural Tangent Kernel (NTK) matrix; (2) reducing churn helps prevent rank collapse and adjusts the step size of regular RL gradients adaptively. Moreover, we introduce Continual Churn Approximated Reduction (C-CHAIN) and demonstrate it improves learning performance and outperforms baselines in a diverse range of continual learning environments on OpenAI Gym Control, ProcGen, DeepMind Control Suite, and MinAtar benchmarks.
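
To make the definition of churn concrete, the following is a minimal sketch of how one might measure it: take one mini-batch gradient step, then compare the network's outputs before and after on inputs that were not in the batch. The tiny two-layer network, the squared-error loss, and all names (`init_params`, `sgd_step`, `churn`) are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def init_params(rng, in_dim, hidden, out_dim):
    # Small two-layer tanh network (hypothetical stand-in for a value or policy net).
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(in_dim), (in_dim, hidden)),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, out_dim)),
    }

def forward(params, x):
    h = np.tanh(x @ params["W1"])
    return h @ params["W2"], h

def sgd_step(params, x, y, lr):
    """One mini-batch SGD step on 0.5 * mean squared error; returns new params."""
    pred, h = forward(params, x)
    err = (pred - y) / len(x)                # dL/dpred
    g_w2 = h.T @ err                         # dL/dW2
    g_h = err @ params["W2"].T               # dL/dh
    g_w1 = x.T @ (g_h * (1.0 - h ** 2))      # dL/dW1 via tanh'(z) = 1 - tanh(z)^2
    return {"W1": params["W1"] - lr * g_w1, "W2": params["W2"] - lr * g_w2}

def churn(params_before, params_after, x_ref):
    """Mean absolute output change on reference inputs outside the training batch."""
    out_before, _ = forward(params_before, x_ref)
    out_after, _ = forward(params_after, x_ref)
    return float(np.mean(np.abs(out_after - out_before)))

rng = np.random.default_rng(0)
params = init_params(rng, in_dim=4, hidden=16, out_dim=2)
x_batch, y_batch = rng.normal(size=(32, 4)), rng.normal(size=(32, 2))
x_ref = rng.normal(size=(64, 4))             # out-of-batch inputs

new_params = sgd_step(params, x_batch, y_batch, lr=0.1)
print("churn on out-of-batch inputs:", churn(params, new_params, x_ref))
```

The reference inputs play the role of "out-of-batch data": the update never saw them, yet the outputs on them still move.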

Problem

Research questions and friction points this paper is trying to address.

Mitigating plasticity loss in continual reinforcement learning

Reducing churn to prevent neural network rank collapse

Improving adaptability in diverse continual learning environments
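
As a rough illustration of what "rank collapse" of the NTK can look like, the sketch below builds an empirical NTK from per-sample parameter gradients of a tiny scalar-output network and counts its numerically nonzero singular values. The network, the tolerance, and the hand-constructed "collapsed" weights (every hidden unit a copy of the first) are assumptions chosen for illustration, not the paper's experimental setup.

```python
import numpy as np

def per_sample_grads(W1, w2, x):
    """Gradient of f(x) = tanh(x @ W1) @ w2 w.r.t. (W1, w2), one flattened row per input."""
    h = np.tanh(x @ W1)                        # (n, hidden)
    dz = (1.0 - h ** 2) * w2                   # df/dz with z = x @ W1, shape (n, hidden)
    g_w1 = x[:, :, None] * dz[:, None, :]      # df/dW1, shape (n, in_dim, hidden)
    return np.concatenate([g_w1.reshape(len(x), -1), h], axis=1)  # df/dw2 = h

def empirical_ntk(W1, w2, x):
    """Empirical NTK: K[i, j] = <grad f(x_i), grad f(x_j)>."""
    J = per_sample_grads(W1, w2, x)
    return J @ J.T

def ntk_rank(K, rel_tol=1e-6):
    """Number of singular values above rel_tol times the largest one."""
    s = np.linalg.svd(K, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
in_dim, hidden, n = 4, 16, 32
x = rng.normal(size=(n, in_dim))

# Randomly initialised weights.
W1 = rng.normal(0.0, 0.5, (in_dim, hidden))
w2 = rng.normal(0.0, 0.25, hidden)

# Degenerate weights: all hidden units identical, so gradients span few directions.
W1_c = np.tile(W1[:, :1], (1, hidden))
w2_c = np.full(hidden, w2[0])

rank_healthy = ntk_rank(empirical_ntk(W1, w2, x))
rank_collapsed = ntk_rank(empirical_ntk(W1_c, w2_c, x))
print("NTK rank (random init):", rank_healthy)
print("NTK rank (duplicated units):", rank_collapsed)
```

With duplicated hidden units the per-sample gradients span at most `in_dim + 1` directions, so the NTK rank is capped no matter how many inputs are used.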

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reduces churn to mitigate plasticity loss

Adaptively adjusts RL gradient step size

Introduces C-CHAIN for continual learning
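
The summary does not spell out the C-CHAIN objective, so the following is only a sketch of one plausible way to penalize churn, not the authors' algorithm: alongside the task loss, add a term that penalizes the output difference between the current weights and a frozen copy from the previous step on a separate reference batch. The linear model, the regression task, the weight `lam`, and every function name here are assumptions.

```python
import numpy as np

def churn_penalty(W, W_prev, x_ref):
    """Mean squared output change on x_ref between W and the frozen W_prev, plus its gradient w.r.t. W."""
    diff = x_ref @ W - x_ref @ W_prev
    value = float(np.mean(diff ** 2))
    grad = 2.0 * x_ref.T @ diff / diff.size
    return value, grad

def task_grad(W, x, y):
    """Gradient of 0.5 * mean squared error for the linear model x @ W."""
    return x.T @ (x @ W - y) / len(x)

def regularized_step(W, W_prev, x, y, x_ref, lr=0.05, lam=1.0):
    """One SGD step on task loss + lam * churn penalty; returns (new W, new frozen copy)."""
    _, g_churn = churn_penalty(W, W_prev, x_ref)
    W_next = W - lr * (task_grad(W, x, y) + lam * g_churn)
    return W_next, W                          # current weights become the "previous" copy

rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 2))              # hypothetical regression target
W = rng.normal(size=(4, 2))
W_prev = W.copy()

for _ in range(200):
    x = rng.normal(size=(32, 4))              # training mini-batch
    x_ref = rng.normal(size=(64, 4))          # separate reference batch
    W, W_prev = regularized_step(W, W_prev, x, x @ W_true, x_ref)

print("final parameter error:", float(np.linalg.norm(W - W_true)))
```

In this toy linear form the penalty gradient is proportional to the previous parameter step `W - W_prev`, so it pulls against the last update rather than replacing the task gradient.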

🔎 Similar Papers

Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning