Dual Decomposition of Weights and Singular Value Low Rank Adaptation

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Standard LoRA suffers from unstable fine-tuning and inefficient transfer of pretrained knowledge due to random initialization of adapter parameters. To address this, we propose SVD-guided magnitude-direction dual decomposition initialization: LoRA weights are decoupled into a scalar magnitude component and a unit-norm direction component, with the direction initialized in a principled manner using the singular value decomposition (SVD) of the pretrained weights. This is the first work to integrate both weight dual decomposition and SVD-based initialization into LoRA. The method significantly enhances optimization stability, better preserves the original representational capacity—particularly beneficial for domain-specific tasks—and improves generalization. On MMLU and GSM8K, it achieves 48.35% and 62.53% (± 1.59) accuracy, respectively, outperforming standard LoRA with faster convergence and stronger generalization.

📝 Abstract
Parameter-Efficient Fine-Tuning (PEFT) has emerged as a critical paradigm for adapting Large Language Models (LLMs) to downstream tasks, among which Low-rank Adaptation (LoRA) represents one of the most widely adopted methodologies. However, existing LoRA-based approaches exhibit two fundamental limitations: unstable training dynamics and inefficient knowledge transfer from pre-trained models, both stemming from random initialization of adapter parameters. To overcome these challenges, we propose DuDe, a novel approach that decomposes weight matrices into magnitude and direction components, employing Singular Value Decomposition (SVD) for principled initialization. Our comprehensive evaluation demonstrates DuDe's superior performance and robustness, achieving up to 48.35% accuracy on MMLU and 62.53% (± 1.59) accuracy on GSM8K. Our theoretical analysis and empirical validation collectively demonstrate that DuDe's decomposition strategy enhances optimization stability and better preserves pre-trained representations, particularly for domain-specific tasks requiring specialized knowledge. The combination of robust empirical performance and rigorous theoretical foundations establishes DuDe as a significant contribution to PEFT methodologies for LLMs.
Problem

Research questions and friction points this paper is trying to address.

Addresses unstable training in LoRA-based PEFT methods
Improves knowledge transfer from pre-trained models
Enhances optimization stability for domain-specific tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes weight matrices into magnitude and direction
Uses SVD for principled initialization
Enhances optimization stability and preserves representations
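The two ideas above—splitting a pretrained weight into magnitude and direction, then seeding the low-rank adapters from its SVD—can be sketched as follows. This is a minimal illustration, not the paper's exact recipe: the function name `dude_style_init`, the per-column magnitude convention, and the symmetric square-root scaling of the SVD factors are all assumptions.

```python
import numpy as np

def dude_style_init(W, r):
    """Hypothetical magnitude-direction decomposition with SVD-seeded
    low-rank adapters (illustrative only; conventions are assumptions)."""
    # Magnitude: per-column norms of the pretrained weight matrix.
    m = np.linalg.norm(W, axis=0, keepdims=True)      # shape (1, d_in)
    # Direction: the pretrained weight normalized to unit columns,
    # so that m * V_dir reconstructs W.
    V_dir = W / (m + 1e-8)                            # shape (d_out, d_in)
    # SVD of the direction matrix; the top-r factors initialize the
    # adapters instead of the usual random A / zero B.
    U, S, Vt = np.linalg.svd(V_dir, full_matrices=False)
    B = U[:, :r] * np.sqrt(S[:r])                     # shape (d_out, r)
    A = np.sqrt(S[:r])[:, None] * Vt[:r]              # shape (r, d_in)
    return m, V_dir, A, B
```

With this split, `B @ A` equals the best rank-`r` approximation of the direction matrix at initialization, so the adapters start aligned with the dominant singular directions of the pretrained weights rather than with random noise—the property the Problem/Innovation bullets attribute to the method.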
👥 Authors
Jialong Han — SIST, ShanghaiTech University
Si Zhang — Meta Platforms Inc. (machine learning, data mining, graph mining)
Ke Zhang — SKLP, Institute of Computing Technology, Chinese Academy of Sciences