LoRA-DA: Data-Aware Initialization for Low-Rank Adaptation via Asymptotic Analysis

📅 2025-10-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing LoRA initialization methods suffer from data blindness and theoretical limitations: they either ignore target-domain data or rely solely on single-step gradients and overly restrictive isotropic assumptions, thereby limiting adaptation performance. This paper proposes the first data-aware LoRA initialization framework grounded in asymptotic analysis. By jointly modeling parameter bias and variance, it explicitly characterizes sampling uncertainty under random data selection and preserves anisotropic structure via a Fisher-gradient formulation. The method requires only a small number of target-domain samples, leveraging Fisher information estimation and low-rank optimization to compute the optimal initial LoRA parameters. Experiments across multiple benchmarks demonstrate substantial improvements in final accuracy, faster convergence, enhanced training stability, and greater robustness to rank misspecification—all with negligible initialization overhead.

Technology Category

Application Category

📝 Abstract
With the widespread adoption of LLMs, LoRA has become a dominant method for PEFT, and its initialization methods have attracted increasing attention. However, existing methods have notable limitations: many methods do not incorporate target-domain data, while gradient-based methods exploit data only at a shallow level by relying on one-step gradient decomposition, which remains unsatisfactory due to the weak empirical performance of the one-step fine-tuning model that serves as their basis, as well as the fact that these methods either lack a rigorous theoretical foundation or depend heavily on restrictive isotropic assumptions. In this paper, we establish a theoretical framework for data-aware LoRA initialization based on asymptotic analysis. Starting from a general optimization objective that minimizes the expectation of the parameter discrepancy between the fine-tuned and target models, we derive an optimization problem with two components: a bias term, which is related to the parameter distance between the fine-tuned and target models, and is approximated using a Fisher-gradient formulation to preserve anisotropy; and a variance term, which accounts for the uncertainty introduced by sampling stochasticity through the Fisher information. By solving this problem, we obtain an optimal initialization strategy for LoRA. Building on this theoretical framework, we develop an efficient algorithm, LoRA-DA, which estimates the terms in the optimization problem from a small set of target domain samples and obtains the optimal LoRA initialization. Empirical results across multiple benchmarks demonstrate that LoRA-DA consistently improves final accuracy over existing initialization methods. Additional studies show faster, more stable convergence, robustness across ranks, and only a small initialization overhead for LoRA-DA. The source code will be released upon publication.
Problem

Research questions and friction points this paper is trying to address.

Establishes theoretical framework for data-aware LoRA initialization via asymptotic analysis
Derives optimal initialization strategy addressing bias-variance tradeoff in fine-tuning
Develops efficient algorithm using target domain data to improve adaptation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Establishes theoretical framework via asymptotic analysis
Derives optimal initialization using Fisher-gradient and Fisher information
Develops efficient algorithm estimating terms from target samples
🔎 Similar Papers
No similar papers found.
Qingyue Zhang
Qingyue Zhang
Tsinghua University
machine learning
C
Chang Chu
Tsinghua Shenzhen International Graduate School, Tsinghua University
T
Tianren Peng
Tsinghua Shenzhen International Graduate School, Tsinghua University
Q
Qi Li
Tsinghua Shenzhen International Graduate School, Tsinghua University
X
Xiangyang Luo
Tsinghua Shenzhen International Graduate School, Tsinghua University
Z
Zhihao Jiang
Tsinghua Shenzhen International Graduate School, Tsinghua University
Shao-Lun Huang
Shao-Lun Huang
T-SIGS, Tsinghua University
Information TheoryMachine learning