Riemannian Optimization for LoRA on the Stiefel Manifold

📅 2025-08-25
🤖 AI Summary
Problem: In LoRA-based fine-tuning of large language models (LLMs), optimizing with AdamW leaves the B matrix with redundant basis directions, degraded effective rank, and poor orthogonality. Method: This work applies Riemannian optimization on the Stiefel manifold to LoRA, presented as the first such application, and imposes explicit orthogonality constraints on the B matrix. The constraint removes geometric redundancy, keeps B at full effective rank, and yields near-perfect orthogonality. The method combines manifold-aware gradient updates with adaptive learning rates, preserving LoRA's parameter efficiency while improving training stability and convergence speed. Contribution/Results: Across multiple benchmark tasks, the Stiefel optimizer consistently outperforms AdamW when paired with either LoRA or DoRA. The results indicate that geometric constraints on B are an important factor in realizing LoRA's potential for efficient LLM fine-tuning.

📝 Abstract
While powerful, large language models (LLMs) present significant fine-tuning challenges due to their size. Parameter-efficient fine-tuning (PEFT) methods like LoRA provide solutions, yet suffer from critical optimizer inefficiencies, notably basis redundancy in LoRA's $B$ matrix under AdamW, which fundamentally limits performance. We address this by optimizing the $B$ matrix on the Stiefel manifold, imposing explicit orthogonality constraints that achieve near-perfect orthogonality and full effective rank. This geometric approach dramatically enhances parameter efficiency and representational capacity. Our Stiefel optimizer consistently outperforms AdamW across benchmarks with both LoRA and DoRA, demonstrating that geometric constraints are the key to unlocking LoRA's full potential for effective LLM fine-tuning.
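
For reference, the constraint the abstract describes can be written out in its standard textbook form. This is a sketch under assumptions, not the paper's own notation: it assumes the LoRA update $\Delta W = BA$ with $B \in \mathbb{R}^{d \times r}$, and the symbols $d$, $r$, $G$, $\eta$, along with the choice of a QR-based retraction, are introduced here for illustration only.

$$
\mathrm{St}(d, r) = \{\, B \in \mathbb{R}^{d \times r} : B^\top B = I_r \,\}
$$

$$
\operatorname{grad} f(B) = G - B\,\operatorname{sym}(B^\top G), \qquad \operatorname{sym}(M) = \tfrac{1}{2}\,(M + M^\top)
$$

$$
B^{+} = \operatorname{qf}\bigl(B - \eta\,\operatorname{grad} f(B)\bigr)
$$

Here $G = \partial f / \partial B$ is the ordinary Euclidean gradient, $\operatorname{grad} f(B)$ is its projection onto the tangent space at $B$, and $\operatorname{qf}(\cdot)$ returns the Q factor of a thin QR decomposition. Because every point of $\mathrm{St}(d, r)$ has orthonormal columns, no two basis vectors of $B$ can overlap, which is what rules out the basis redundancy described above.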
Problem

Research questions and friction points this paper is trying to address.

Addressing basis redundancy in LoRA's B matrix optimization
Imposing orthonormal constraints via Stiefel manifold geometry
Enhancing parameter efficiency and representational capacity in LLMs
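
The two properties named above, orthogonality and effective rank, can be measured directly on a given $B$. The sketch below is one plausible way to do so, assuming the Frobenius norm of $B^\top B - I$ as the orthogonality error and the entropy-based effective rank (exponential of the Shannon entropy of the normalized singular values). The source does not state which definitions the paper uses, so both function names and both metrics are assumptions.

```python
import numpy as np


def orthogonality_error(b):
    """Frobenius norm of B^T B - I; zero when the columns are orthonormal."""
    r = b.shape[1]
    return float(np.linalg.norm(b.T @ b - np.eye(r), ord="fro"))


def effective_rank(b, eps=1e-12):
    """Entropy-based effective rank (an assumed definition, not the paper's).

    Normalizes the singular values into a distribution and returns
    exp(entropy), which ranges from 1 (one dominant direction) up to r
    (all directions equally used).
    """
    s = np.linalg.svd(b, compute_uv=False)
    p = s / (s.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))


rng = np.random.default_rng(0)
d, r = 64, 8  # hypothetical LoRA dimensions

# Orthonormal columns: the Q factor of a random matrix.
b_orth, _ = np.linalg.qr(rng.standard_normal((d, r)))

# Redundant columns: every column is nearly the same vector.
u = rng.standard_normal(d)
b_redundant = np.outer(u, np.ones(r)) + 1e-3 * rng.standard_normal((d, r))

err_orth = orthogonality_error(b_orth)
rank_orth = effective_rank(b_orth)
rank_redundant = effective_rank(b_redundant)
```

With orthonormal columns the effective rank equals $r$ up to floating-point error, while the redundant matrix scores close to 1 despite having the same number of parameters.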
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizing B matrix on Stiefel manifold
Imposing explicit orthogonality constraints
Enhancing parameter efficiency geometrically
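
A single constrained update of $B$ can be sketched as follows. This is a minimal illustration, assuming plain gradient descent with a tangent-space projection and a QR retraction; the paper's optimizer also uses adaptive learning rates, which are omitted here, and its exact retraction is not stated in the source. The function names (`project_tangent`, `qr_retract`, `stiefel_sgd_step`) and the dimensions are hypothetical.

```python
import numpy as np


def sym(m):
    """Symmetric part of a square matrix."""
    return 0.5 * (m + m.T)


def project_tangent(b, g):
    """Project a Euclidean gradient g onto the tangent space at b."""
    return g - b @ sym(b.T @ g)


def qr_retract(x):
    """Map a full-column-rank matrix back onto the manifold via thin QR."""
    q, r_factor = np.linalg.qr(x)
    # Flip column signs so that diag(R) is positive, making Q unique.
    signs = np.sign(np.diag(r_factor))
    signs[signs == 0] = 1.0
    return q * signs


def stiefel_sgd_step(b, g, lr):
    """One plain (non-adaptive) Riemannian gradient step; an assumed simplification."""
    xi = project_tangent(b, g)
    return qr_retract(b - lr * xi)


rng = np.random.default_rng(0)
d, r = 64, 8  # hypothetical LoRA dimensions

b = qr_retract(rng.standard_normal((d, r)))  # start on the manifold
g = rng.standard_normal((d, r))              # stand-in for dLoss/dB

xi = project_tangent(b, g)
tangency = float(np.linalg.norm(sym(b.T @ xi)))  # zero for a tangent vector

b_new = stiefel_sgd_step(b, g, lr=0.1)
orth_err = float(np.linalg.norm(b_new.T @ b_new - np.eye(r)))
```

The retraction is what keeps $B^\top B = I$ after every step. An unconstrained optimizer such as AdamW has no equivalent mechanism, so the columns of $B$ are free to drift toward one another.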
Authors

Juneyoung Park, Opt-AI Inc.
Minjae Kang, Seoul National University (Reinforcement Learning, Robotic Manipulation)
Seongbae Lee, Opt-AI Inc.
Haegang Lee, Opt-AI Inc.
Seongwan Kim, Opt-AI Inc.
Jaeho Lee, Opt-AI Inc.