Rethinking Parameter Sharing for LLM Fine-Tuning with Multiple LoRAs

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multi-task and federated learning settings, existing LoRA-based approaches suffer from inefficient parameter sharing across multiple adapters, redundant training of matrix A, and underutilization of matrix B. Method: The paper theoretically identifies matrix B as the primary carrier of knowledge encoding and transfer, while matrix A captures task-specific characteristics. Accordingly, it proposes ALoRA and its federated extension, Fed-ALoRA, built on an asymmetric sharing architecture: a single shared matrix B paired with multiple task- or client-specific matrices A, further enhanced by heterogeneous rank adaptation to improve generalization under client and task heterogeneity. The method integrates low-rank decomposition, multi-task learning, and federated optimization. Results: Extensive experiments on commonsense reasoning, mathematical reasoning, and multi-task/federated NLP benchmarks demonstrate that ALoRA and Fed-ALoRA achieve accuracy comparable to or surpassing state-of-the-art methods, while yielding more balanced performance across tasks.

📝 Abstract
Large language models are often adapted using parameter-efficient techniques such as Low-Rank Adaptation (LoRA), formulated as $y = W_0x + BAx$, where $W_0$ is the pre-trained parameters and $x$ is the input to the adapted layer. While multi-adapter extensions often employ multiple LoRAs, prior studies suggest that the inner $A$ matrices are highly similar during training and thus suitable for sharing. We revisit this phenomenon and find that this similarity is largely attributable to the identical initialization rather than shared knowledge, with $B$ playing a more critical role in knowledge encoding and transfer. Motivated by these insights, we propose **ALoRA**, an asymmetric multi-LoRA design with multiple $A$ matrices and a single shared $B$ in multi-task fine-tuning, and **Fed-ALoRA**, which shares $B$ across clients in federated fine-tuning under both homogeneous and heterogeneous settings, through a novel matrix decomposition strategy to accommodate heterogeneous ranks across clients. Experiments on commonsense reasoning, math reasoning, multi-task NLP, and federated NLP datasets demonstrate that our methods achieve more balanced performance across tasks with comparable or superior average accuracy relative to existing multi-LoRA approaches. Code is available at https://github.com/OptMN-Lab/ALoRA.
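The asymmetric design in the abstract ($y = W_0x + BA_tx$ with one shared $B$ and a per-task $A_t$) can be sketched in a few lines of NumPy. This is a minimal illustration of the forward pass only, with made-up dimensions and names (`B_shared`, `A_tasks`); it is not the paper's implementation (see the linked repository for that), and training logic is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, n_tasks = 8, 6, 4, 3  # illustrative sizes, not from the paper

# Frozen pre-trained weight W0.
W0 = rng.normal(size=(d_out, d_in))

# ALoRA-style asymmetry (sketch): one shared B, one task-specific A per task.
B_shared = rng.normal(size=(d_out, r))
A_tasks = [rng.normal(scale=0.01, size=(r, d_in)) for _ in range(n_tasks)]

def alora_forward(x, task_id):
    """y = W0 x + B A_t x, with B shared across all tasks."""
    return W0 @ x + B_shared @ (A_tasks[task_id] @ x)

x = rng.normal(size=(d_in,))
ys = [alora_forward(x, t) for t in range(n_tasks)]
```

Note the parameter-count consequence: standard multi-LoRA stores $T$ pairs $(A_t, B_t)$, while this layout stores $T$ matrices $A_t$ but only one $B$, so the $d_{out} \times r$ block is paid for once rather than $T$ times.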
Problem

Research questions and friction points this paper is trying to address.

Reevaluating parameter sharing in multi-LoRA fine-tuning
Proposing asymmetric LoRA design for multi-task adaptation
Enabling federated fine-tuning with shared B matrices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Shared B matrix in multi-task fine-tuning
Asymmetric LoRA design with multiple A matrices
Matrix decomposition for federated heterogeneous ranks
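The summary does not spell out the paper's matrix decomposition for heterogeneous ranks, so the following is only one plausible reconciliation scheme, stated as an assumption: each rank-$r_k$ client trains the first $r_k$ columns of the server's full-rank shared $B$, and the server averages each column over the clients that cover it. All names (`B_server`, `client_ranks`) and the column-slicing rule are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, r_max = 8, 4
client_ranks = [2, 3, 4]  # heterogeneous ranks (hypothetical values)

# Server holds the full shared B; each client sees only its first r_k columns.
B_server = rng.normal(size=(d_out, r_max))

def client_update(B_slice):
    # Stand-in for local training: a small perturbation of the slice.
    return B_slice + 0.01 * rng.normal(size=B_slice.shape)

updates = [client_update(B_server[:, :rk]) for rk in client_ranks]

# Server aggregation: average every client update that covers column j.
B_new = B_server.copy()
for j in range(r_max):
    covering = [u[:, j] for u, rk in zip(updates, client_ranks) if j < rk]
    if covering:
        B_new[:, j] = np.mean(covering, axis=0)
```

Under this sketch only the shared $B$ is communicated and aggregated; the client-specific $A$ matrices stay local, which is what lets clients keep task-specific adaptations while pooling transferable knowledge in $B$.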