ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation

📅 2024-06-16
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
To balance parameter efficiency, adaptability, and robustness in large language model (LLM) fine-tuning, this paper proposes ShareLoRA, an efficient fine-tuning method that shares the low-rank adaptation matrices across transformer layers. Its core idea is inter-layer weight sharing of the LoRA factors, and it is compatible with mainstream architectures including RoBERTa, GPT-2, and the LLaMA series. Compared with standard LoRA, ShareLoRA reduces trainable parameters by 44%–96% and substantially lowers memory overhead. In continual learning settings it improves accuracy by up to 1.2% on GSM8K, and it consistently outperforms LoRA baselines in zero-shot, few-shot, and cross-domain generalization settings, indicating that shared structural priors transfer well across tasks and model scales.

📝 Abstract
In this paper, we introduce Shared Low-Rank Adaptation (ShareLoRA), a Large Language Model (LLM) fine-tuning technique that balances parameter efficiency, adaptability, and robustness without compromising performance. By strategically sharing the low-rank weight matrices across different layers, ShareLoRA achieves 44% to 96% reduction in trainable parameters compared to standard LoRA, alongside a substantial decrease in memory overhead. This efficiency gain scales with model size, making ShareLoRA particularly advantageous for resource-constrained environments. Importantly, ShareLoRA not only maintains model performance but also exhibits robustness in both classification and generation tasks across diverse models, including RoBERTa, GPT-2, and the LLaMA series (1, 2, and 3). It consistently outperforms LoRA in zero-shot, few-shot, and continual fine-tuning scenarios, achieving up to 1.2% average accuracy improvement, and enhanced generalization across domains. In continual learning settings, ShareLoRA achieves 1.2% higher accuracy on GSM8K, 0.6% on HumanEval, and 0.5% on both MMLU and MMLU-Pro. Our results demonstrate that ShareLoRA supports high-quality fine-tuning while offering strong generalization and continual adaptation across various model scales and diverse tasks.
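The sharing scheme described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the choice of sharing the down-projection A across layers (rather than B, or both) is an assumption, as are the dimensions, and the ~48% figure below is simply what that particular choice yields (sharing both factors would push the reduction toward the upper end of the reported 44%–96% range).

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, n_layers = 1024, 8, 24  # hidden size, LoRA rank, transformer layers

# Standard LoRA: every layer trains its own pair A_i (r x d), B_i (d x r)
lora_params = n_layers * (r * d + d * r)

# ShareLoRA (one assumed variant): a single A shared by all layers,
# with only the per-layer B_i kept layer-specific
sharelora_params = r * d + n_layers * (d * r)

reduction = 1 - sharelora_params / lora_params
print(f"trainable params: LoRA={lora_params:,}  ShareLoRA={sharelora_params:,}")
print(f"reduction: {reduction:.1%}")  # ~48% when only A is shared

# Forward pass for layer i: h <- W_i h + B_i (A h), with A shared across layers
A = rng.standard_normal((r, d)) * 0.01            # shared down-projection
B = [np.zeros((d, r)) for _ in range(n_layers)]   # per-layer up-projections (zero-init, as in LoRA)
W = [rng.standard_normal((d, d)) * 0.01 for _ in range(n_layers)]

h = rng.standard_normal(d)
for i in range(n_layers):
    h = W[i] @ h + B[i] @ (A @ h)
```

Because the shared A is paid for once instead of once per layer, the saving grows with depth, which matches the abstract's observation that the efficiency gain scales with model size.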
Problem

Research questions and friction points this paper is trying to address.

Balancing parameter efficiency, adaptability, and robustness in LLM fine-tuning
Standard LoRA's trainable parameters and memory overhead still grow with model size and depth
Preserving performance and generalization across diverse tasks and model scales
Innovation

Methods, ideas, or system contributions that make the work stand out.

Shares low-rank adaptation matrices across transformer layers
Cuts trainable parameters by 44%–96% relative to standard LoRA
Maintains performance while improving zero-shot, few-shot, and continual fine-tuning robustness
Authors
Yurun Song (UC Irvine)
Junchen Zhao (UC Irvine)
Ian G. Harris (University of California Irvine; research areas: Design Verification, Computer Security, Natural Language Processing)
S. Jyothi (UC Irvine, VMware Research)