Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA

📅 2025-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high communication and computational overhead of federated fine-tuning for large language models (LLMs), this paper proposes RoLoRA, a framework that introduces alternating optimization into federated LoRA tuning, sequentially updating the up-projection and down-projection matrices. This design reduces client upload volume (by up to 67%) and local computation cost while enhancing model expressivity and robustness. Theoretically, the paper provides the first convergence analysis of alternating optimization for linear models in federated learning, establishing its positive impact on resilience to data heterogeneity and malicious updates. Extensive experiments across MNIST, RoBERTa-Large, and Llama-2-7B show that RoLoRA achieves an average accuracy improvement of 3.2% over state-of-the-art federated parameter-efficient fine-tuning (PEFT) methods, validating its effectiveness and generalizability across diverse model scales and tasks.
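The alternating schedule described above can be sketched on the paper's simplified linear setting. The toy below is an illustrative reconstruction, not the authors' implementation: names such as `local_step` and `mse` are ours, clients draw synthetic data from a shared low-rank target, and each round the server averages (FedAvg-style) only the matrix being updated, which is what saves upload bandwidth.

```python
import numpy as np

# Toy alternating federated LoRA on a linear model y = (W0 + B @ A) x.
# Illustrative sketch only; all names and hyperparameters are assumptions.
rng = np.random.default_rng(0)
d, r = 8, 2                  # feature dimension, LoRA rank
num_clients = 4

W0 = rng.normal(size=(d, d))                                # frozen pretrained weight
delta = rng.normal(size=(d, r)) @ rng.normal(size=(r, d))   # target low-rank shift

X_test = rng.normal(size=(64, d))                           # held-out evaluation data
Y_test = X_test @ (W0 + delta).T

def mse(A, B):
    """Held-out mean squared error of the adapted model."""
    return float(np.mean(((X_test @ (W0 + B @ A).T) - Y_test) ** 2))

def local_step(A, B, X, Y, lr=0.05):
    """One local gradient step on 0.5 * ||(W0 + B A) x - y||^2."""
    err = X @ (W0 + B @ A).T - Y            # (n, d) residuals
    gB = err.T @ (X @ A.T) / len(X)         # dL/dB
    gA = B.T @ (err.T @ X) / len(X)         # dL/dA
    return A - lr * gA, B - lr * gB

A = rng.normal(size=(r, d)) * 0.1           # down-projection (LoRA-style init)
B = np.zeros((d, r))                        # up-projection, initialized to zero
loss_before = mse(A, B)

for rnd in range(200):
    # Alternating schedule: even rounds update B, odd rounds update A.
    # (B goes first: while B == 0, the gradient w.r.t. A vanishes.)
    update_B = (rnd % 2 == 0)
    uploads = []
    for _ in range(num_clients):
        X = rng.normal(size=(16, d))        # synthetic local batch
        Y = X @ (W0 + delta).T
        A_new, B_new = local_step(A, B, X, Y)
        uploads.append(B_new if update_B else A_new)
    avg = np.mean(uploads, axis=0)          # server averages only the active matrix
    if update_B:
        B = avg
    else:
        A = avg

loss_after = mse(A, B)
```

Because clients upload only one of the two matrices per round, per-round communication is roughly halved relative to sending both `A` and `B`, which is the intuition behind the reported upload savings.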

📝 Abstract
Parameter-Efficient Fine-Tuning (PEFT) methods like Low-Rank Adaptation (LoRA) optimize federated training by reducing computational and communication costs. We propose RoLoRA, a federated framework using alternating optimization to fine-tune LoRA adapters. Our approach emphasizes the importance of learning both the up-projection and down-projection matrices to enhance expressiveness and robustness. We use theoretical analysis and extensive experiments to demonstrate the advantages of RoLoRA over prior approaches that either generate imperfect model updates or limit the expressiveness of the model. We present theoretical analysis on a simplified linear model to demonstrate the importance of learning both matrices in LoRA, and provide experimental evaluations on a toy neural network on MNIST as well as on large language models, including RoBERTa-Large and Llama-2-7B, across diverse tasks.
Problem

Research questions and friction points this paper is trying to address.

Reducing communication and computation overhead in federated fine-tuning
Enhancing model expressiveness and robustness under data heterogeneity
Avoiding imperfect model updates when aggregating LoRA adapters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Alternating optimization of LoRA
Learning projection matrices
Robust federated finetuning framework