AI Summary
To address privacy preservation, resource and data heterogeneity, and parameter aggregation bias when fine-tuning large language models (LLMs) under federated learning, this paper proposes HLoRA, a LoRA variant supporting dynamic rank adaptation. Its core innovation is a rank-aware heterogeneous parameter aggregation mechanism, which dynamically allocates LoRA ranks per client based on local computational capacity and data volume, then applies weighted aggregation to mitigate the gradient bias induced by fixed-rank configurations. Integrated into the Plato framework, HLoRA is evaluated on the MRPC, QQP, and RTE benchmarks. Under identical resource constraints, it converges 23% faster and improves final accuracy by 1.8 percentage points over standard LoRA. The method significantly enhances communication efficiency, computational fairness, and convergence consistency across heterogeneous edge devices.
Abstract
Federated learning systems are an efficient approach to scaling distributed model training across a large number of participants or data owners while guaranteeing data privacy. To adapt today's most popular pre-trained large language models to domains with data privacy requirements, existing works fine-tune them in federated learning environments across data owners using parameter-efficient fine-tuning approaches such as LoRA. To address resource and data heterogeneity among participants, previous works adopted heterogeneous LoRA, assigning different ranks to different clients and padding their updates to a common rank, which introduces bias into parameter aggregation. To address this issue, we propose HLoRA, an efficient federated learning system utilizing a modified LoRA approach that incorporates rank heterogeneity to optimize communication and computational efficiency. Experimental results on the Microsoft Research Paraphrase Corpus (MRPC), Quora Question Pairs (QQP), and Recognizing Textual Entailment (RTE) datasets, within the Plato federated learning framework, demonstrate that our method not only reduces resource demands but also outperforms traditional LoRA applications in convergence speed and final model accuracy. This study shows that our approach can significantly improve the practical deployment of federated LLM fine-tuning, particularly in environments with diverse client resources.
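To make the aggregation-bias problem concrete: when clients train LoRA factors at different ranks, naive zero-padding to a common rank dilutes the higher-rank dimensions toward zero during averaging. Below is a minimal sketch of one plausible rank-aware aggregation rule, in which each rank slice is normalized only by the weight of the clients that actually contributed to it. The function name, shapes, and normalization rule are illustrative assumptions, not the paper's exact HLoRA aggregation.

```python
import numpy as np

def rank_aware_aggregate(client_updates, weights):
    """Sketch of rank-aware aggregation for heterogeneous LoRA factors.

    client_updates: list of (A, B) pairs with A of shape (r_i, k) and
        B of shape (d, r_i), where r_i is client i's local LoRA rank.
    weights: raw per-client weights (e.g. local dataset sizes).
    NOTE: this is an illustrative scheme, not the paper's exact rule.
    """
    r_max = max(A.shape[0] for A, _ in client_updates)
    k = client_updates[0][0].shape[1]
    d = client_updates[0][1].shape[0]

    A_agg = np.zeros((r_max, k))
    B_agg = np.zeros((d, r_max))
    mass = np.zeros(r_max)  # total weight contributing to each rank slice

    for w, (A, B) in zip(weights, client_updates):
        r = A.shape[0]
        A_agg[:r, :] += w * A   # rows beyond r stay zero (implicit padding)
        B_agg[:, :r] += w * B   # columns beyond r stay zero
        mass[:r] += w

    # Normalize each rank slice only by the weight that contributed to it,
    # so ranks held by few clients are not diluted toward zero by padding.
    A_agg /= mass[:, None]
    B_agg /= mass[None, :]
    return A_agg, B_agg
```

Under plain zero-padding, a rank dimension trained by a single client would be averaged against the zeros of every other client; the per-slice normalization above avoids exactly that dilution, which is the kind of fixed-rank aggregation bias the abstract describes.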