PLATE: Plasticity-Tunable Efficient Adapters for Geometry-Aware Continual Learning

📅 2026-02-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses catastrophic forgetting in continual learning with pre-trained models when access to previous task data is prohibited. The authors propose a structured low-rank adaptation method grounded in geometric redundancy of pre-trained weights. By analyzing the intrinsic geometric structure of the pre-trained weight space, they identify a protected subspace for parameter updates and formulate the update as \( \Delta W = BAQ^\top \), where frozen matrices \( B \) and \( Q \) project trainable low-rank matrix \( A \) exclusively onto redundant directions. This approach is the first to leverage geometric redundancy to explicitly locate plasticity regions, enabling a controllable trade-off between plasticity and stability without requiring data replay. Experimental results demonstrate that the method effectively suppresses functional drift and significantly improves retention of performance on prior tasks, even under worst-case scenarios.

📝 Abstract
We develop a continual learning method for pretrained models that *requires no access to old-task data*, addressing a practical barrier in foundation model adaptation where pretraining distributions are often unavailable. Our key observation is that pretrained networks exhibit substantial *geometric redundancy*, and that this redundancy can be exploited in two complementary ways. First, redundant neurons provide a proxy for dominant pretraining-era feature directions, enabling the construction of approximately protected update subspaces directly from pretrained weights. Second, redundancy offers a natural bias for *where* to place plasticity: by restricting updates to a subset of redundant neurons and constraining the remaining degrees of freedom, we obtain update families with reduced functional drift on the old-data distribution and improved worst-case retention guarantees. These insights lead to PLATE (**Pla**sticity-**T**unable **E**fficient Adapters), a continual learning method requiring no past-task data that provides explicit control over the plasticity-retention trade-off. PLATE parameterizes each layer with a structured low-rank update $\Delta W = B A Q^\top$, where $B$ and $Q$ are computed once from pretrained weights and kept frozen, and only $A$ is trained on the new task. The code is available at https://github.com/SalesforceAIResearch/PLATE.
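A minimal numpy sketch of the $\Delta W = B A Q^\top$ parameterization described above. The paper derives $B$ and $Q$ from a geometric-redundancy analysis of the pretrained weights; as a hedged stand-in, this sketch uses the *least* significant singular directions of the pretrained weight matrix as the "redundant" subspace. All sizes and the SVD-based construction are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4  # layer width and adapter rank (illustrative sizes)

# Stand-in for a pretrained weight matrix.
W = rng.standard_normal((d, d))

# Assumed construction: take the bottom-r singular directions of W
# as the redundant subspace (a proxy for the paper's geometric
# redundancy analysis). B and Q are computed once and kept frozen.
U, s, Vt = np.linalg.svd(W)
B = U[:, -r:]        # frozen output-side basis, shape (d, r)
Q = Vt[-r:, :].T     # frozen input-side basis,  shape (d, r)

# Only the small core A would be trained on the new task.
A = 0.1 * rng.standard_normal((r, r))

delta_W = B @ A @ Q.T   # structured update, rank at most r
W_new = W + delta_W

# Because B spans only the redundant directions, projecting delta_W
# onto the dominant (protected) left singular directions gives ~0:
# the dominant pretrained features are untouched by the update.
U_protected = U[:, :-r]
drift = np.linalg.norm(U_protected.T @ delta_W)
```

Freezing $B$ and $Q$ is what makes the plasticity-retention trade-off controllable: the rank $r$ (and which directions are deemed redundant) fixes in advance where plasticity can live, while training only $A$ keeps the parameter count at $r^2$ per layer.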
Problem

Research questions and friction points this paper is trying to address.

continual learning
plasticity
foundation models
data-free adaptation
geometric redundancy
Innovation

Methods, ideas, or system contributions that make the work stand out.

continual learning
plasticity-retention trade-off
geometric redundancy
low-rank adaptation
foundation model adaptation