🤖 AI Summary
This work addresses two bottlenecks in federated learning (FL) with Low-Rank Adaptation (LoRA): (1) the bias introduced when the server naïvely averages LoRA matrices, and (2) inconsistent client-side LoRA initialization across rounds, which causes update lag. We propose a federated LoRA framework built around a server-side gradient correction term, described as the first of its kind, that simultaneously mitigates aggregation bias and aligns client LoRA initializations, thereby ensuring global update consistency and theoretical convergence. The method is architecture-agnostic and compatible with ViT, MLP-Mixer, and other mainstream vision backbones. Evaluated on large-scale benchmarks, it outperforms state-of-the-art (SOTA) methods in both accuracy and convergence stability while remaining efficient in communication and computation.
📝 Abstract
Foundation models (FMs) achieve strong performance across diverse tasks with task-specific fine-tuning, yet full-parameter fine-tuning is often computationally prohibitive for large models. Parameter-efficient fine-tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) reduce this cost by introducing low-rank matrices so that far fewer parameters are tuned. While LoRA makes fine-tuning efficient, it still requires significant data for adaptation, which makes Federated Learning (FL) an appealing solution because of its privacy-preserving collaborative framework. However, combining LoRA with FL introduces two key challenges: Server-Side Aggregation Bias, where server-side averaging of LoRA matrices diverges from the ideal global update, and Client-Side Initialization Lag, where clients need a consistent LoRA initialization across rounds. Existing approaches address these challenges individually, which limits their effectiveness. We propose LoRA-FAIR, a novel method that tackles both issues by introducing a correction term on the server, enhancing aggregation efficiency and accuracy. LoRA-FAIR maintains computational and communication efficiency while outperforming state-of-the-art methods. Experimental results on ViT and MLP-Mixer models across large-scale datasets demonstrate that LoRA-FAIR consistently improves performance in FL settings.
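To make Server-Side Aggregation Bias concrete, here is a minimal, dependency-free sketch. It assumes the usual LoRA parameterization, in which each client k holds a low-rank update B_k A_k, and it assumes equal client weights; the function names and the two toy clients are hypothetical and do not come from the paper. The sketch only illustrates the gap between averaging the products and multiplying the averages. It does not implement LoRA-FAIR's correction term.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def mean_matrices(mats):
    """Element-wise unweighted mean of equally shaped matrices."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)] for i in range(rows)]


def aggregation_gap(Bs, As):
    """Compare the ideal global update with naive factor-wise averaging.

    ideal: mean over clients of B_k @ A_k (average of the full updates)
    naive: mean(B_k) @ mean(A_k)          (average each factor, then multiply)
    gap:   largest absolute entry-wise difference between the two
    """
    ideal = mean_matrices([matmul(B, A) for B, A in zip(Bs, As)])
    naive = matmul(mean_matrices(Bs), mean_matrices(As))
    gap = max(abs(p - q) for r1, r2 in zip(ideal, naive) for p, q in zip(r1, r2))
    return ideal, naive, gap


# Two hypothetical clients with rank-1 adapters on a 2x2 layer.
Bs = [[[1.0], [0.0]], [[0.0], [1.0]]]  # each B_k is 2x1
As = [[[1.0, 0.0]], [[0.0, 1.0]]]      # each A_k is 1x2

ideal, naive, gap = aggregation_gap(Bs, As)
print("ideal:", ideal)  # [[0.5, 0.0], [0.0, 0.5]]
print("naive:", naive)  # [[0.25, 0.25], [0.25, 0.25]]
print("gap:  ", gap)    # 0.25
```

In this toy case the ideal update has rank 2, so no single rank-1 pair can reproduce it exactly, while the naive product of averaged factors stays rank 1. The gap disappears when all clients hold identical factors, which suggests why heterogeneous client data makes the bias matter.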