ConsNoTrainLoRA: Data-driven Weight Initialization of Low-rank Adapters using Constraints

📅 2025-07-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the slow convergence and suboptimal performance that random weight initialization causes in LoRA fine-tuning of large models, this paper proposes a training-free, data-driven initialization method. The core insight is to formulate LoRA initialization as a domain adaptation problem in which alignment between the pre-training and fine-tuning activation distributions is explicitly enforced; this yields a closed-form solution for directly estimating the adapter weights. The method also supports rank-adaptive decomposition of the up/down projection matrices, adding parameterization flexibility. Crucially, it requires neither additional training nor backpropagation, only forward-pass activation statistics and a singular value decomposition. Evaluated across image generation, classification, and understanding tasks, the approach accelerates convergence by 1.8× on average and outperforms standard LoRA and existing data-driven initialization methods by +2.3% in average task performance.

📝 Abstract
Foundation models are pre-trained on large-scale datasets and subsequently fine-tuned on small-scale datasets using parameter-efficient fine-tuning (PEFT) techniques like low-rank adapters (LoRA). In most previous works, LoRA weight matrices are randomly initialized with a fixed rank across all attachment points. In this paper, we improve convergence and final performance of LoRA fine-tuning, using our proposed data-driven weight initialization method, ConsNoTrainLoRA (CNTLoRA). We express LoRA initialization as a domain shift problem where we use multiple constraints relating the pre-training and fine-tuning activations. By reformulating these constraints, we obtain a closed-form estimate of LoRA weights that depends on pre-training weights and fine-tuning activation vectors and hence requires no training during initialization. This weight estimate is decomposed to initialize the up and down matrices with proposed flexibility of variable ranks. With the proposed initialization method, we fine-tune on downstream tasks such as image generation, image classification and image understanding. Both quantitative and qualitative results demonstrate that CNTLoRA outperforms standard and data-driven weight initialization methods. Extensive analyses and ablations further elucidate the design choices of our framework, providing an optimal recipe for faster convergence and enhanced performance.
Problem

Research questions and friction points this paper is trying to address.

Improves LoRA fine-tuning convergence and performance
Solves domain shift in LoRA weight initialization
Enables variable rank initialization without training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-driven LoRA weight initialization method
Closed-form estimate without training
Variable ranks for up and down matrices
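The innovations above follow a common recipe: estimate a weight update in closed form from forward-pass activations, then SVD-truncate it into the up and down LoRA factors. The sketch below illustrates that recipe only; the activation-alignment constraint is replaced with a simple ridge-regularized least-squares stand-in, and all function and variable names here are hypothetical, not the paper's actual formulation.

```python
import numpy as np

def cnt_lora_init_sketch(W0, X_pre, X_ft, rank, ridge=1e-4):
    """Illustrative data-driven LoRA initialization (toy stand-in).

    W0:    pretrained weight matrix, shape (d_out, d_in)
    X_pre: pre-training activations,  shape (n, d_in)
    X_ft:  fine-tuning activations,   shape (n, d_in)
    rank:  target LoRA rank r
    """
    # Toy alignment target: the adapted layer's response to fine-tuning
    # inputs should match the pretrained layer's response to pre-training
    # inputs (a crude proxy for the paper's activation constraints).
    Y_pre = X_pre @ W0.T                        # (n, d_out)
    R = Y_pre - X_ft @ W0.T                     # residual to explain, (n, d_out)

    # Closed-form ridge least-squares for dW:  dW @ X_ft.T ≈ R.T
    G = X_ft.T @ X_ft + ridge * np.eye(X_ft.shape[1])
    dW = np.linalg.solve(G, X_ft.T @ R).T       # (d_out, d_in), no backprop

    # SVD + rank truncation -> up/down factors, singular values split evenly.
    U, S, Vt = np.linalg.svd(dW, full_matrices=False)
    r = min(rank, S.size)                       # rank can vary per layer
    B = U[:, :r] * np.sqrt(S[:r])               # "up" matrix,   (d_out, r)
    A = np.sqrt(S[:r])[:, None] * Vt[:r]        # "down" matrix, (r, d_in)
    return B, A
```

Because the estimate is a plain linear solve plus an SVD, it needs only cached activation statistics, and truncating at a per-layer `r` is what makes variable-rank initialization cheap.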