OPLoRA: Orthogonal Projection LoRA Prevents Catastrophic Forgetting during Parameter-Efficient Fine-Tuning

📅 2025-10-14
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
LoRA-based parameter-efficient fine-tuning often suffers from catastrophic forgetting, primarily because low-rank updates perturb the dominant singular directions of the pretrained weights. To address this, we propose OPLoRA, a method that applies double-sided orthogonal projections with respect to the left and right top-$k$ singular subspaces (spanned by $U_k$ and $V_k$, respectively). We prove that OPLoRA exactly preserves the top-$k$ singular triplets of the pretrained weight matrix, and we introduce a subspace-interference metric $\rho_k$ to quantify forgetting risk. OPLoRA decomposes the frozen backbone weights via SVD and constrains LoRA updates with the projections $P_L = I - U_k U_k^\top$ and $P_R = I - V_k V_k^\top$, enabling parameter-efficient adaptation while guaranteeing retention of the dominant directions. Experiments on LLaMA-2 7B and Qwen2.5 7B demonstrate that OPLoRA significantly mitigates forgetting while achieving performance on par with or superior to standard LoRA across commonsense reasoning, mathematical problem solving, and code generation tasks.
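The construction above can be sketched in a few lines of NumPy. The dimensions, variable names, and verification step below are illustrative choices, not taken from the paper; the key point is that a projected update $P_L (BA) P_R$ leaves each top-$k$ singular triplet of $W$ unchanged, since $P_R V_k = 0$ and $U_k^\top P_L = 0$.

```python
# Minimal sketch of OPLoRA's bilateral projection (dimensions are illustrative).
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, k, r = 64, 48, 8, 4

W = rng.standard_normal((d_out, d_in))            # frozen pretrained weight
U, s, Vt = np.linalg.svd(W, full_matrices=False)
U_k, V_k = U[:, :k], Vt[:k, :].T                  # top-k singular subspaces

P_L = np.eye(d_out) - U_k @ U_k.T                 # left projector  (I - U_k U_k^T)
P_R = np.eye(d_in) - V_k @ V_k.T                  # right projector (I - V_k V_k^T)

B = rng.standard_normal((d_out, r))               # LoRA factors (rank r)
A = rng.standard_normal((r, d_in))
delta = P_L @ (B @ A) @ P_R                       # projected low-rank update

# Each top-k triplet still satisfies (W + delta) v_i = sigma_i u_i exactly,
# because delta annihilates V_k on the right.
err = np.max(np.abs((W + delta) @ V_k - U_k * s[:k]))
print(err)
```

Running this prints an error on the order of floating-point noise, which is the "exact preservation" guarantee restated numerically.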

πŸ“ Abstract
Low-Rank Adaptation (LoRA) enables efficient fine-tuning of large language models but suffers from catastrophic forgetting when learned updates interfere with the dominant singular directions that encode essential pre-trained knowledge. We propose Orthogonal Projection LoRA (OPLoRA), a theoretically grounded approach that prevents this interference through double-sided orthogonal projections. By decomposing frozen weights via SVD, OPLoRA constrains LoRA updates to lie entirely within the orthogonal complement of the top-$k$ singular subspace using projections $P_L = I - U_k U_k^\top$ and $P_R = I - V_k V_k^\top$. We prove that this construction exactly preserves the top-$k$ singular triples, providing mathematical guarantees for knowledge retention. To quantify subspace interference, we introduce $\rho_k$, a metric measuring update alignment with dominant directions. Extensive experiments across commonsense reasoning, mathematics, and code generation demonstrate that OPLoRA significantly reduces forgetting while maintaining competitive task-specific performance on LLaMA-2 7B and Qwen2.5 7B, establishing orthogonal projection as an effective mechanism for knowledge preservation in parameter-efficient fine-tuning.
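The abstract does not reproduce the exact definition of $\rho_k$, so the sketch below uses an illustrative proxy: the fraction of an update's Frobenius energy that falls inside the top-$k$ left and right singular subspaces. The function name and the averaging over the two sides are my assumptions, not the paper's formula; the sketch only shows why a projected update scores zero on any such alignment measure.

```python
# Hedged sketch of a subspace-interference measure in the spirit of rho_k.
# The exact paper definition is not given here; this proxy is an assumption.
import numpy as np

def interference(delta_w, U_k, V_k):
    """Fraction of ||delta_w||_F^2 overlapping span(U_k) / span(V_k)."""
    left = U_k @ (U_k.T @ delta_w)        # component hitting the left subspace
    right = (delta_w @ V_k) @ V_k.T       # component hitting the right subspace
    total = np.linalg.norm(delta_w) ** 2
    return (np.linalg.norm(left) ** 2 + np.linalg.norm(right) ** 2) / (2 * total)

rng = np.random.default_rng(1)
W = rng.standard_normal((32, 24))
U, _, Vt = np.linalg.svd(W, full_matrices=False)
U_k, V_k = U[:, :4], Vt[:4, :].T

raw = rng.standard_normal((32, 24))                        # unconstrained update
projected = (np.eye(32) - U_k @ U_k.T) @ raw @ (np.eye(24) - V_k @ V_k.T)

print(interference(raw, U_k, V_k))        # nonzero for a generic update
print(interference(projected, U_k, V_k))  # ~0 after bilateral projection
```

An unconstrained random update overlaps the dominant subspaces roughly in proportion to $k/d$, while the doubly projected update has (up to floating-point error) zero overlap, which is the forgetting-risk distinction the metric is meant to capture.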
Problem

Research questions and friction points this paper is trying to address.

Preventing catastrophic forgetting in LoRA fine-tuning of language models
Constraining parameter updates to orthogonal subspaces via projections
Preserving essential pre-trained knowledge while enabling task adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Orthogonal projection prevents catastrophic forgetting
Constrains updates to orthogonal complement via SVD
Preserves top-k singular triples mathematically
🔎 Similar Papers
No similar papers found.