🤖 AI Summary
This work identifies the fundamental mechanism underlying “plasticity loss” in deep neural networks during continual learning: spectral collapse of the Hessian matrix, which degrades the parameter space’s responsiveness to gradients from new tasks. To address this, we propose the τ-trainability theoretical framework—unifying plasticity preservation under a single, principled lens—and develop a Kronecker-factored Hessian spectral analysis method. Building on this, we introduce a joint strategy: (i) preserving effective feature rank to delay spectral collapse, and (ii) applying adaptive L2 regularization to suppress overfitting along uninformative parameter directions. Experiments demonstrate substantial improvements in plasticity retention and cross-task generalization across diverse continual learning and reinforcement learning benchmarks. Our approach provides an interpretable, scalable, and theoretically grounded paradigm for plasticity regulation—bridging mechanistic insight with practical algorithmic design.
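One of the two regularizers above relies on the *effective rank* of a feature matrix. A common definition (not necessarily the exact one used in the paper) is the exponential of the entropy of the normalized singular-value distribution; a minimal sketch, assuming that definition and illustrative data:

```python
import numpy as np

def effective_rank(features, eps=1e-12):
    """Effective rank = exp(entropy of the normalized singular values).
    Values near the full rank indicate a rich feature space; a drop
    toward 1 signals the kind of collapse the regularizer delays."""
    s = np.linalg.svd(features, compute_uv=False)
    p = s / (s.sum() + eps)                      # singular values as a distribution
    entropy = -(p * np.log(p + eps)).sum()
    return float(np.exp(entropy))

rng = np.random.default_rng(0)
rich = rng.normal(size=(256, 64))                            # nearly full-rank batch of features
collapsed = np.outer(rng.normal(size=256), rng.normal(size=64))  # rank-1 features
print(effective_rank(rich), effective_rank(collapsed))       # high vs. ~1
```

Because effective rank is differentiable in the singular values, a penalty encouraging it to stay high can be added to the training loss alongside the L2 term.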
📝 Abstract
We investigate why deep neural networks suffer from *loss of plasticity* in continual learning, failing to learn new tasks without reinitializing parameters. We show that this failure is preceded by Hessian spectral collapse at new-task initialization, where meaningful curvature directions vanish and gradient descent becomes ineffective. To characterize the necessary condition for successful training, we introduce the notion of $\tau$-trainability and show that current plasticity-preserving algorithms can be unified under this framework. Targeting spectral collapse directly, we then discuss the Kronecker-factored approximation of the Hessian, which motivates two regularization enhancements: maintaining high effective feature rank and applying $L_2$ penalties. Experiments on continual supervised and reinforcement learning tasks confirm that combining these two regularizers effectively preserves plasticity.
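The Kronecker-factored approximation mentioned above makes the layer-wise Hessian spectrum cheap to inspect: for a linear layer, the Hessian (or Fisher) block is approximated as $A \otimes G$, where $A$ is the second-moment matrix of layer inputs and $G$ that of backpropagated output gradients (as in K-FAC), and the eigenvalues of a Kronecker product are all pairwise products of the factor eigenvalues. A minimal sketch with synthetic data standing in for a real layer's activations and gradients:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 512, 32, 16
a = rng.normal(size=(n, d_in))   # layer inputs over a batch (illustrative)
g = rng.normal(size=(n, d_out))  # gradients w.r.t. layer outputs (illustrative)

A = a.T @ a / n                  # (d_in, d_in) input second-moment matrix
G = g.T @ g / n                  # (d_out, d_out) gradient second-moment matrix

# Eigenvalues of A ⊗ G are the pairwise products of the factor eigenvalues,
# so the full (d_in * d_out)-dimensional spectrum never requires forming
# the large Kronecker product explicitly.
eig_A = np.linalg.eigvalsh(A)
eig_G = np.linalg.eigvalsh(G)
spectrum = np.sort(np.outer(eig_A, eig_G).ravel())

# Spectral collapse shows up as most eigenvalues shrinking toward zero;
# one simple diagnostic is the fraction above a relative threshold.
frac_alive = (spectrum > 1e-3 * spectrum.max()).mean()
print(spectrum.size, frac_alive)
```

Monitoring this approximate spectrum per layer at each new-task boundary is one way to detect collapse before plasticity is visibly lost.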