Fine Tuning without Catastrophic Forgetting via Selective Low Rank Adaptation

📅 2025-01-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address catastrophic forgetting, degraded out-of-distribution (OOD) generalization, and the high computational overhead of large-model domain adaptation, this paper proposes a parameter-efficient fine-tuning method based on selective activation of LoRA modules. The core innovation is a learnable binary gating function, built on the Task Adaptive Parameter Sharing (TAPS) framework and low-rank decomposition, that enables fine-grained, task-aware sparsity in the LoRA updates; the method activates only ~5% of LoRA blocks. Evaluated on CLIP and DINO-ViT, it reduces trainable parameters by over 95% compared to standard LoRA, maintains or improves OOD accuracy, and significantly mitigates forgetting of prior-task knowledge. To the authors' knowledge, this is the first work within the parameter-efficient fine-tuning (PEFT) paradigm to systematically improve both OOD robustness and long-term knowledge retention.
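The gating idea in the summary can be sketched as follows. This is a minimal illustrative forward pass, not the paper's code: the function name, matrix shapes, and the simple threshold gate are assumptions. A binary indicator decides whether a LoRA block's low-rank update is added to the frozen base layer at all (during training, a TAPS-style gate would use a straight-through estimator so the gate logit still receives gradients).

```python
# Sketch of a selectively gated LoRA update for a linear layer y = W x.
# All names and shapes here are illustrative, not taken from the paper's code.
import numpy as np

def gated_lora_forward(x, W, A, B, gate_logit, threshold=0.0):
    """Forward pass with a binary gate on the low-rank update.

    W: (out, in)  frozen base weight
    A: (r, in)    trainable low-rank factor
    B: (out, r)   trainable low-rank factor
    gate_logit:   scalar learnable score; the indicator activates the
                  LoRA block only when gate_logit > threshold.
    """
    gate = 1.0 if gate_logit > threshold else 0.0  # hard binary indicator
    return W @ x + gate * (B @ (A @ x))            # inactive gate => base layer only
```

Blocks whose gate stays below threshold contribute nothing and need not be trained or stored, which is what drives the sparsity in the method.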

📝 Abstract
Adapting deep learning models to new domains often requires computationally intensive retraining and risks catastrophic forgetting. While fine-tuning enables domain-specific adaptation, it can reduce robustness to distribution shifts, impacting out-of-distribution (OOD) performance. Pre-trained zero-shot models like CLIP offer strong generalization but may suffer degraded robustness after fine-tuning. Building on Task Adaptive Parameter Sharing (TAPS), we propose a simple yet effective extension as a parameter-efficient fine-tuning (PEFT) method, using an indicator function to selectively activate Low-Rank Adaptation (LoRA) blocks. Our approach minimizes knowledge loss, retains the pre-trained model's generalization strengths under domain shifts, and significantly reduces computational costs compared to traditional fine-tuning. We demonstrate that effective fine-tuning can be achieved with as few as 5% of blocks active, substantially improving efficiency. Evaluations on pre-trained models such as CLIP and DINO-ViT demonstrate our method's broad applicability and its effectiveness in maintaining performance and retaining prior knowledge.
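A back-of-envelope count shows how activating ~5% of LoRA blocks translates into the claimed >95% reduction in trainable parameters. The dimensions below (48 attention projections across 12 layers, hidden size 768, rank 4) are typical ViT-B values chosen purely for illustration; the paper does not specify these exact numbers.

```python
# Illustrative trainable-parameter count when only a fraction of LoRA
# blocks is active. Dimensions are assumed ViT-B-style values.

def lora_params(d_in, d_out, rank):
    # Each LoRA block adds A (rank x d_in) and B (d_out x rank) factors.
    return rank * (d_in + d_out)

n_blocks = 48           # e.g. q/k/v/out projections x 12 transformer layers
d = 768                 # hidden size
rank = 4
active_fraction = 0.05  # ~5% of blocks selected by the gate

full_lora = n_blocks * lora_params(d, d, rank)
selective = int(round(n_blocks * active_fraction)) * lora_params(d, d, rank)
print(full_lora, selective, selective / full_lora)
```

With these assumed sizes, selective activation trains roughly 4% of the parameters that standard (all-blocks) LoRA would, consistent with the >95% reduction reported.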
Problem

Research questions and friction points this paper is trying to address.

Continual Learning
Knowledge Retention
Resource Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

TAPS-based Fine-tuning
LoRA Selective Activation
Efficient Adaptation