🤖 AI Summary
This paper addresses the significant performance gap between Low-Rank Adaptation (LoRA) and full-parameter fine-tuning, particularly in accuracy and convergence speed. The authors propose LoFT (Low-Rank Fine-Tuning), the first method to dynamically project both the first- and second-moment estimates of the Adam optimizer onto a low-rank subspace. Aligning the optimizer dynamics of low-rank updates with those of full-parameter updates improves training stability and convergence efficiency without introducing additional hyperparameters (e.g., the scaling factor α). LoFT requires only standard low-rank matrix decomposition and subspace projection, and it incurs zero inference overhead. Extensive experiments demonstrate that LoFT consistently outperforms LoRA across diverse tasks and substantially narrows the performance gap between parameter-efficient fine-tuning and full fine-tuning.
📝 Abstract
Large pre-trained models are commonly adapted to downstream tasks using parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA), which injects small trainable low-rank matrices instead of updating all weights. While LoRA dramatically reduces trainable parameters with little overhead, it can still underperform full fine-tuning in accuracy and often converges more slowly. We introduce LoFT, a novel low-rank adaptation method that behaves like full fine-tuning by aligning the optimizer's internal dynamics with those of updating all model weights. LoFT not only learns weight updates in a low-rank subspace (like LoRA) but also properly projects the optimizer's first and second moments (Adam's momentum and variance) into the same subspace, mirroring full-model updates. By aligning the low-rank update itself with the full update, LoFT eliminates the need for tuning extra hyperparameters, e.g., the LoRA scaling factor $\alpha$. Empirically, this approach substantially narrows the performance gap between adapter-based tuning and full fine-tuning and consistently outperforms standard LoRA-style methods, all without increasing inference cost.
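To make the core idea concrete, below is a minimal sketch of projecting Adam's moment estimates onto the subspace spanned by a low-rank adapter factor. The helper `project_moments_to_subspace` and the surrogate rule for the second moment are illustrative assumptions, not the paper's exact algorithm; they only show what "projecting the optimizer's first and second moments into the low-rank subspace" could look like in code.

```python
import torch


def project_moments_to_subspace(m, v, B):
    """Hypothetical helper: restrict Adam's moments to the adapter subspace.

    m, v : first/second moment estimates shaped like the full weight (d_out x d_in)
    B    : low-rank factor (d_out x r) whose column space defines the subspace
    """
    # Orthogonal projector onto the column space of B: P = B (B^T B)^{-1} B^T.
    gram_inv = torch.linalg.inv(B.T @ B)   # (r, r)
    P = B @ gram_inv @ B.T                 # (d_out, d_out)

    # First moment restricted to the rank-r subspace of the LoRA update.
    m_lowrank = P @ m
    # For the elementwise second moment, one simple surrogate is to project the
    # root-mean-square statistic and re-square it; the paper's exact rule may differ.
    v_lowrank = (P @ v.sqrt()) ** 2
    return m_lowrank, v_lowrank


if __name__ == "__main__":
    d_out, d_in, r = 64, 32, 4
    B = torch.randn(d_out, r)      # illustrative LoRA factor
    m = torch.randn(d_out, d_in)   # Adam first moment of the full weight
    v = torch.rand(d_out, d_in)    # Adam second moment (nonnegative)
    m_p, v_p = project_moments_to_subspace(m, v, B)
    print(m_p.shape, v_p.shape)
```

The point of the sketch is that the projected moments live in the same rank-r subspace as the low-rank weight update, so the adapter's optimizer state tracks what a full-parameter Adam update would do within that subspace.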