🤖 AI Summary
This paper addresses the significant performance gap between Low-Rank Adaptation (LoRA) and full-parameter fine-tuning, particularly in accuracy and convergence speed. The authors propose LoFT (Low-Rank Fine-Tuning), the first method to dynamically project both the first- and second-moment estimates of the Adam optimizer onto a low-rank subspace. Aligning the optimizer dynamics of low-rank updates with those of full-parameter updates improves training stability and convergence efficiency without introducing additional hyperparameters (e.g., the scaling factor α). LoFT requires only standard low-rank matrix decomposition and subspace projection, and it incurs zero inference overhead. Extensive experiments demonstrate that LoFT consistently outperforms LoRA across diverse tasks and substantially narrows the performance gap between parameter-efficient fine-tuning and full fine-tuning.
📝 Abstract
Large pre-trained models are commonly adapted to downstream tasks using parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA), which injects small trainable low-rank matrices instead of updating all weights. While LoRA dramatically reduces trainable parameters with little overhead, it can still underperform full fine-tuning in accuracy and often converges more slowly. We introduce LoFT, a novel low-rank adaptation method that behaves like full fine-tuning by aligning the optimizer's internal dynamics with those of updating all model weights. LoFT not only learns weight updates in a low-rank subspace (like LoRA) but also properly projects the optimizer's first and second moments (Adam's momentum and variance) into the same subspace, mirroring full-model updates. By aligning the low-rank update itself with the full update, LoFT eliminates the need for tuning extra hyperparameters, e.g., the LoRA scaling factor $\alpha$. Empirically, this approach substantially narrows the performance gap between adapter-based tuning and full fine-tuning and consistently outperforms standard LoRA-style methods, all without increasing inference cost.
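To make the core idea concrete, below is a minimal sketch of projecting Adam's moment estimates onto the subspace spanned by a low-rank adapter factor. The helper `project_moments_to_subspace` and the surrogate rule for the second moment are illustrative assumptions, not the paper's exact algorithm; they only show what "projecting the optimizer's first and second moments into the low-rank subspace" could look like in code.

```python
import torch


def project_moments_to_subspace(m, v, B):
    """Hypothetical helper: restrict Adam's moments to the adapter subspace.

    m, v : first/second moment estimates shaped like the full weight (d_out x d_in)
    B    : low-rank factor (d_out x r) whose column space defines the subspace
    """
    # Orthogonal projector onto the column space of B: P = B (B^T B)^{-1} B^T.
    gram_inv = torch.linalg.inv(B.T @ B)   # (r, r)
    P = B @ gram_inv @ B.T                 # (d_out, d_out)

    # First moment restricted to the rank-r subspace of the LoRA update.
    m_lowrank = P @ m
    # For the elementwise second moment, one simple surrogate is to project the
    # root-mean-square statistic and re-square it; the paper's exact rule may differ.
    v_lowrank = (P @ v.sqrt()) ** 2
    return m_lowrank, v_lowrank


if __name__ == "__main__":
    d_out, d_in, r = 64, 32, 4
    B = torch.randn(d_out, r)      # illustrative LoRA factor
    m = torch.randn(d_out, d_in)   # Adam first moment of the full weight
    v = torch.rand(d_out, d_in)    # Adam second moment (nonnegative)
    m_p, v_p = project_moments_to_subspace(m, v, B)
    print(m_p.shape, v_p.shape)
```

The point of the sketch is that the projected moments live in the same rank-r subspace as the low-rank weight update, so the adapter's optimizer state tracks what a full-parameter Adam update would do within that subspace.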