🤖 AI Summary
This work addresses the issue of catastrophic forgetting of pre-trained knowledge when applying Low-Rank Adaptation (LoRA) to downstream tasks. The authors propose initializing LoRA weights using Principal Component Analysis (PCA) and systematically investigate how fine-tuning different principal components affects the trade-off between task performance and knowledge retention. They find that adapting intermediate principal components achieves a more effective balance between these objectives. This strategy demonstrates particular robustness under high learning rates and in continual learning scenarios. Extensive experiments across diverse vision and natural language processing tasks show that the proposed method significantly improves accuracy while effectively mitigating forgetting, outperforming baseline approaches that fine-tune only the leading or trailing principal components.
📝 Abstract
Low-Rank Adaptation (LoRA) methods have emerged as crucial techniques for adapting large pre-trained models to downstream tasks under computational and memory constraints. However, they face a fundamental challenge in balancing task-specific performance gains against catastrophic forgetting of pre-trained knowledge, and existing methods offer inconsistent recommendations. This paper presents a comprehensive analysis of the performance-forgetting trade-offs inherent in low-rank adaptation initialized from principal components. Our investigation reveals that fine-tuning intermediate components achieves a better balance and shows greater robustness to high learning rates than the first (PiSSA) or last (MiLoRA) components used in existing work. Building on these findings, we provide a practical LoRA initialization approach that offers superior trade-offs. In a thorough empirical study across a variety of computer vision and NLP tasks, we demonstrate that our approach improves accuracy and reduces forgetting, including in continual learning scenarios.
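The initialization family the abstract contrasts (PiSSA: leading components; MiLoRA: trailing components; this work: intermediate components) can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration of the general SVD-based LoRA initialization pattern, not the authors' implementation; the function name `lora_init_from_svd` and the `offset` parameter are hypothetical.

```python
import numpy as np

def lora_init_from_svd(W, rank, offset):
    """Initialize LoRA factors B @ A from a window of singular components of W.

    offset=0 selects the leading components (PiSSA-style),
    offset=len(S)-rank the trailing ones (MiLoRA-style);
    an intermediate offset corresponds to the strategy studied here.
    (Hypothetical helper for illustration only.)
    """
    # SVD of the pretrained weight: W = U @ diag(S) @ Vt
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    idx = slice(offset, offset + rank)
    sqrt_s = np.sqrt(S[idx])
    B = U[:, idx] * sqrt_s          # shape (m, rank), trainable
    A = sqrt_s[:, None] * Vt[idx]   # shape (rank, n), trainable
    # The frozen residual holds the remaining components, so that
    # W_res + B @ A reconstructs W exactly at initialization.
    W_res = W - B @ A
    return A, B, W_res
```

Only `A` and `B` would be fine-tuned; `W_res` stays frozen, so the selected singular directions are the only ones the downstream task can modify.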