MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning

📅 2024-06-13
🏛️ arXiv.org
📈 Citations: 13
Influential: 3

🤖 AI Summary
LoRA-style fine-tuning of large language models (LLMs) trains low-rank matrices in an unguided subspace, which can interfere with the knowledge already encoded in the pretrained weights. Method: The paper proposes MiLoRA, which decomposes each weight matrix via singular value decomposition (SVD), freezes the principal singular components to preserve core knowledge, and restricts low-rank adaptation to the minor singular components. The low-rank matrices are initialized in a subspace orthogonal to the principal matrix, which separates task-specific adaptation from knowledge retention. Results: The authors report that MiLoRA outperforms prior approaches on commonsense reasoning, mathematical reasoning, instruction following, and visual instruction following benchmarks.

📝 Abstract
Efficient finetuning of large language models (LLMs) aims to adapt the LLMs with reduced computational and memory cost. Previous LoRA-based approaches initialize the low-rank matrices with a Gaussian distribution and zeros while keeping the original weight matrices frozen. However, trainable parameters optimized in an unguided subspace might interfere with the well-learned subspace of the pretrained weight matrices. In this paper, we propose MiLoRA, a simple yet effective LLM finetuning approach that only updates the minor singular components of the weight matrix while keeping the principal singular components frozen. It is observed that the minor matrix corresponds to noisy or long-tail information, while the principal matrix contains important knowledge. MiLoRA initializes the low-rank matrices within a subspace that is orthogonal to the principal matrix, so the pretrained knowledge is expected to be well preserved. During finetuning, MiLoRA makes the most of the less-optimized subspace for learning from the labeled dataset. Extensive experiments on commonsense reasoning, math reasoning, instruction following, and visual instruction following benchmarks demonstrate the superior performance of our method.
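The initialization described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `milora_init`, the toy matrix shape, and the choice to split the minor singular values evenly (by square root) between the two low-rank factors are all assumptions of this sketch.

```python
import numpy as np

def milora_init(W, r):
    """Split W into a frozen principal part and trainable low-rank factors.

    Hypothetical helper: the square-root split of the minor singular values
    between B and A is an assumption, not something the abstract specifies.
    """
    # Reduced SVD; NumPy returns singular values in descending order.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    k = S.shape[0] - r  # number of principal (frozen) components

    # Principal part: built from the k largest singular values, kept frozen.
    W_principal = (U[:, :k] * S[:k]) @ Vt[:k, :]

    # Minor part: the r smallest singular values initialize the trainable factors.
    sqrt_S = np.sqrt(S[k:])
    B = U[:, k:] * sqrt_S            # shape (m, r)
    A = sqrt_S[:, None] * Vt[k:, :]  # shape (r, n)
    return W_principal, B, A

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))  # toy stand-in for a pretrained weight matrix
W_principal, B, A = milora_init(W, r=2)

# At initialization the frozen part plus the low-rank product reproduces W.
reconstruction_ok = np.allclose(W_principal + B @ A, W)
```

In this sketch only `B` and `A` would receive gradient updates during finetuning, while `W_principal` stays fixed. Because the model's output is unchanged at initialization, training starts from the pretrained behavior.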
Problem

Research questions and friction points this paper is trying to address.

Efficient finetuning of large language models with reduced computational cost.
Preserving pretrained knowledge by updating minor singular components only.
Improving performance on reasoning and instruction-following tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

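Updates only the minor singular components of each weight matrix.
Initializes the low-rank matrices in a subspace orthogonal to the principal matrix.
Keeps the principal singular components frozen to preserve pretrained knowledge.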
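The orthogonality claim follows from the standard properties of the SVD. The notation below is an assumption of this note, not taken from the page: subscript p marks the principal components, and subscript m marks the minor ones.

```latex
\[
\begin{aligned}
W &= U \Sigma V^\top
   = \underbrace{U_p \Sigma_p V_p^\top}_{W_p}
   + \underbrace{U_m \Sigma_m V_m^\top}_{W_m}, \\
W_p^\top W_m &= V_p \Sigma_p \left( U_p^\top U_m \right) \Sigma_m V_m^\top = 0,
\qquad \text{since } U_p^\top U_m = 0 .
\end{aligned}
\]
```

The columns of U are orthonormal, so the principal and minor left singular vectors are mutually orthogonal. The same argument applies to the right singular vectors in V.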