KeepLoRA: Continual Learning with Residual Gradient Adaptation

📅 2026-01-27
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the challenge of retaining both pre-trained knowledge and knowledge of previously learned tasks while effectively acquiring new capabilities in continual learning of vision-language models. The authors propose a subspace disentanglement approach that uses principal component analysis to decompose the parameter space into a shared principal subspace and task-specific residual subspaces. Low-rank adaptation (LoRA) is applied within the residual subspaces, and a residual gradient adaptation mechanism projects the gradients of new tasks onto directions orthogonal to both the principal subspace and the dominant feature directions of historical tasks. This design disentangles pre-trained, past-task, and new-task knowledge. The method achieves state-of-the-art performance across multiple continual learning benchmarks, substantially improving the trade-off between knowledge retention and adaptation to new tasks.

📝 Abstract
Continual learning for pre-trained vision-language models requires balancing three competing objectives: retaining pre-trained knowledge, preserving knowledge from a sequence of learned tasks, and maintaining the plasticity to acquire new knowledge. This paper presents a simple but effective approach called KeepLoRA to balance these objectives. We first analyze the knowledge retention mechanism within the model parameter space and find that general knowledge is mainly encoded in the principal subspace, while task-specific knowledge is encoded in the residual subspace. Motivated by this finding, KeepLoRA learns new tasks by restricting LoRA parameter updates to the residual subspace to prevent interference with previously learned capabilities. Specifically, we infuse knowledge for a new task by projecting its gradient onto a subspace orthogonal to both the principal subspace of the pre-trained model and the dominant directions of previous task features. Our theoretical and empirical analyses confirm that KeepLoRA balances the three objectives and achieves state-of-the-art performance. The implementation code is available at https://github.com/MaolinLuo/KeepLoRA.
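The core mechanism in the abstract, projecting a new task's gradient onto directions orthogonal to both the pre-trained principal subspace and the dominant directions of previous task features, can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the ranks, matrix shapes, and function names are assumptions chosen for the example.

```python
import numpy as np

def principal_subspace(W, k):
    """Top-k left singular vectors of a weight matrix (the principal subspace)."""
    # Illustrative stand-in for the paper's PCA/SVD-based decomposition.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k]  # (d, k), orthonormal columns

def residual_projection(grad, *bases):
    """Remove from `grad` every component lying in the span of the given bases."""
    # Orthonormalize the combined span so one projection handles both subspaces.
    Q, _ = np.linalg.qr(np.hstack(bases))
    return grad - Q @ (Q.T @ grad)

rng = np.random.default_rng(0)
W_pre = rng.standard_normal((64, 32))              # stand-in for a pre-trained weight
U = principal_subspace(W_pre, k=8)                 # principal subspace of the model
V, _ = np.linalg.qr(rng.standard_normal((64, 4)))  # stand-in for dominant directions
                                                   # of previous task features

g = rng.standard_normal((64, 32))      # raw gradient of the new task
g_res = residual_projection(g, U, V)   # update confined to the residual subspace

# g_res is (numerically) orthogonal to both subspaces, so the update does not
# move the model along pre-trained or past-task directions.
```

In this sketch the projected gradient `g_res` would then drive the LoRA parameter update for the new task, leaving the principal subspace and past-task directions untouched.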
Problem

Research questions and friction points this paper is trying to address.

continual learning
vision-language models
knowledge retention
plasticity
task interference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual Learning
LoRA
Residual Subspace
Knowledge Retention
Gradient Projection
Mao-Lin Luo
School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Zi-Hao Zhou
School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Yi-Lin Zhang
School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Yuanyu Wan
Zhejiang University
Machine Learning · Online Learning · Distributed Optimization
Tong Wei
Southeast University
Machine Learning
Min-Ling Zhang
Professor, School of Computer Science and Engineering, Southeast University, China
Artificial Intelligence · Machine Learning · Data Mining