PrunedLoRA: Robust Gradient-Based Structured Pruning for Low-Rank Adaptation in Fine-Tuning

📅 2025-09-30
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the limited representational capacity of Low-Rank Adaptation (LoRA) in large language model fine-tuning, this paper proposes PrunedLoRA, a gradient-based structured pruning framework that extracts expressive, adaptively rank-allocated low-rank adapters from an over-parameterized initialization. It is the first to bring theoretically grounded, gradient-sensitivity-based pruning to LoRA, providing a provable upper bound on pruning error and showing greater robustness to weight perturbations than activation-based methods. PrunedLoRA supports fine-grained pruning with parameter freezing, preventing pruned components from reactivating during training. Experiments show that PrunedLoRA consistently outperforms LoRA and its variants on mathematical reasoning, code generation, and natural language understanding tasks, and it maintains strong performance across multiple sparsity levels, demonstrating solid generalization and robustness.
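The gradient-sensitivity criterion summarized above can be sketched as a first-order Taylor importance score over the rank components of a LoRA adapter: removing rank component r perturbs the loss roughly in proportion to the inner product of that component's weights and its gradient. This is a minimal NumPy sketch; the function names and the exact scoring rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gradient_importance(B, A, grad_B, grad_A):
    """First-order Taylor importance of each rank component of a LoRA
    adapter Delta W = B @ A, with B of shape (out_dim, rank) and A of
    shape (rank, in_dim). Removing rank r changes the loss by roughly
    |grad . weight| restricted to that component's parameters."""
    imp_B = np.abs(np.sum(grad_B * B, axis=0))  # per-rank score from B, shape (rank,)
    imp_A = np.abs(np.sum(grad_A * A, axis=1))  # per-rank score from A, shape (rank,)
    return imp_B + imp_A

def prune_ranks(B, A, importance, keep):
    """Keep the `keep` highest-importance rank components; drop the rest."""
    kept = np.argsort(importance)[::-1][:keep]
    mask = np.zeros(importance.shape[0], dtype=bool)
    mask[kept] = True
    return B[:, mask], A[mask, :], mask
```

Starting from an over-parameterized rank and repeatedly pruning with a score like this is one way to realize the adaptive rank allocation the summary describes.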

πŸ“ Abstract
Low-rank adaptation (LoRA) has become a widely used paradigm for parameter-efficient fine-tuning of large language models, yet its representational capacity often lags behind full fine-tuning. Within the context of LoRA, a key open question is how to obtain expressive low-rank adapters from over-parameterized spaces. We propose *PrunedLoRA*, a new framework that leverages structured pruning to obtain highly representative low-rank adapters from an over-parameterized initialization. Unlike prior approaches that impose a fixed low-rank budget, PrunedLoRA dynamically prunes less important components during fine-tuning and prevents their reactivation, enabling flexible and adaptive rank allocation. For structured pruning, by minimizing the pruning error for overall loss, we provide fine-grained pruning and recovery updates in a gradient-based pruning strategy with grounded interpretation. We provide the first theoretical analysis of the robustness of structured pruning and provably show that under the impact of weight perturbation, gradient-based pruning is more robust than activation-based pruning with respect to overall loss. Empirically, PrunedLoRA consistently outperforms LoRA and its variants across supervised fine-tuning tasks in mathematical reasoning, code generation, and natural language understanding, and it also demonstrates advantages over existing structured pruning methods across diverse sparsity levels.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LoRA's representational capacity through structured pruning techniques
Dynamically pruning components to enable flexible rank allocation during fine-tuning
Providing theoretical robustness analysis for gradient-based structured pruning methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses structured pruning for low-rank adapters
Dynamically prunes components during fine-tuning
Employs gradient-based pruning with theoretical robustness
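The "prevents reactivation" idea listed above can be approximated by zeroing the gradients of pruned rank components after each backward pass, so no optimizer step can revive them. This is a hypothetical NumPy sketch under that assumption; the paper's actual freezing mechanism may differ.

```python
import numpy as np

def freeze_pruned_grads(grad_B, grad_A, keep_mask):
    """Zero the gradient columns/rows belonging to pruned rank components
    of a LoRA adapter Delta W = B @ A, so a gradient step cannot revive
    them. keep_mask[r] is True for ranks that survived pruning.
    (Hypothetical helper, not from the paper.)"""
    gB = grad_B.copy()
    gA = grad_A.copy()
    gB[:, ~keep_mask] = 0.0  # B has shape (out_dim, rank): zero pruned columns
    gA[~keep_mask, :] = 0.0  # A has shape (rank, in_dim): zero pruned rows
    return gB, gA
```

Applying this mask after every backward pass keeps the pruned components frozen at zero for the rest of fine-tuning.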