Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models

📅 2024-10-22
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) suffer from catastrophic forgetting during continual learning—i.e., significant performance degradation on previously learned tasks when adapting to new ones. To address this, we propose Controlled Low-Rank Adaptation (CLoRA), the first LoRA-based method to incorporate null-space direction constraints into adapter design. By enforcing subspace regularization on adapter outputs, CLoRA explicitly bounds output perturbations, thereby mitigating forgetting without compromising model capacity. As a parameter-efficient fine-tuning (PEFT) approach, CLoRA jointly optimizes adaptability to new tasks and stability on old ones. Experiments across single-stage fine-tuning and continual learning benchmarks demonstrate that CLoRA consistently outperforms standard LoRA: it reduces average performance drop on old tasks by 37%, while maintaining competitive accuracy on new tasks—effectively balancing model expressivity and forgetting suppression.

📝 Abstract
Large language models (LLMs) exhibit remarkable capabilities in natural language processing but face catastrophic forgetting when learning new tasks, where adaptation to a new domain leads to a substantial decline in performance on previous tasks. In this paper, we propose Controlled LoRA (CLoRA), a subspace regularization method on the LoRA structure. Aiming to reduce the scale of output change while introducing minimal constraints on model capacity, CLoRA imposes a constraint on the direction of the updating matrix's null space. Experimental results on one-stage LLM fine-tuning tasks and continual learning settings highlight the superiority of CLoRA as an effective parameter-efficient fine-tuning method that mitigates catastrophic forgetting. Further investigation of model parameters indicates that CLoRA effectively balances the trade-off between model capacity and degree of forgetting.
Problem

Research questions and friction points this paper is trying to address.

Mitigate catastrophic forgetting in large language models
Balance model capacity and forgetting trade-off
Improve parameter-efficient finetuning with subspace regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Controlled LoRA with subspace regularization
Constraint on null space direction
Balances capacity and forgetting trade-off
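
The null-space constraint above can be sketched as a regularization term: if the LoRA update is ΔW = BA and P is a fixed matrix whose columns are directions we want ΔW to leave unchanged, penalizing ‖ΔW·P‖²_F pushes those directions into the null space of the update, bounding output perturbations on old-task inputs. This is an illustrative reading of the abstract, not the authors' code; the function and variable names (`clora_regularizer`, `P`) are assumptions.

```python
import numpy as np

def clora_regularizer(A, B, P):
    """Squared Frobenius norm of (B @ A) @ P.

    Penalizes the component of the low-rank update dW = B @ A acting
    on the directions in P, pushing P toward the null space of dW
    (an illustrative sketch of CLoRA's subspace regularization).
    """
    dW = B @ A  # (d_out, d_in) low-rank update
    return np.linalg.norm(dW @ P, ord="fro") ** 2

# Toy shapes: d_in = d_out = 8, LoRA rank r = 2, k = 3 constrained directions
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 8))  # LoRA "down" projection
B = rng.normal(size=(8, 2))  # LoRA "up" projection
P = rng.normal(size=(8, 3))  # directions to keep in the null space

loss = clora_regularizer(A, B, P)  # added to the task loss with some weight
```

In training, this term would be weighted and summed with the task loss; when the update is exactly zero (or P lies in its null space), the penalty vanishes, so it constrains direction rather than capping the update's magnitude everywhere.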
Authors

Yuheng Lu (Peking University)
Bingshuo Qian (Beijing University of Posts and Telecommunications)
Caixia Yuan (Beijing University of Posts and Telecommunications)
Huixing Jiang (Meituan Group)
Xiaojie Wang (Beijing University of Posts and Telecommunications)