🤖 AI Summary
FedLoRA faces two key bottlenecks in federated fine-tuning: (1) large local-global generalization gaps due to imprecise client updates, and (2) high communication overhead. To address these, we propose FLoRA-NA, which introduces a proxy aggregation matrix on the server. Leveraging only the low-rank LoRA parameters uploaded by clients, FLoRA-NA performs efficient and accurate aggregation via gradient approximation and matrix reconstruction—without requiring additional communication. This enables the server to closely approximate the global update, substantially narrowing the performance gap between personalized local models and the generalized global model. Extensive experiments across diverse tasks—including natural language understanding, mathematical reasoning, and code generation—and multiple foundation models demonstrate that FLoRA-NA consistently achieves state-of-the-art global performance while incurring minimal communication cost (transmitting LoRA weights only), thereby offering both computational efficiency and strong generalization capability.
📝 Abstract
With the rapid emergence of foundation models and the increasing need for fine-tuning across distributed environments, Federated Low-Rank Adaptation (FedLoRA) has recently gained significant attention. Despite its enormous potential, current FedLoRA methods face notable challenges due to inexact updates. Existing approaches have attempted to mitigate this issue, but they often introduce a \emph{local-global generalization gap} and incur \emph{substantial communication overhead}, limiting their scalability and effectiveness. To address these limitations, we propose \textbf{F}ederated \textbf{Lo}w-\textbf{R}ank \textbf{A}ggregation with \textbf{N}early \textbf{A}ccurate Estimation (FLoRA-NA). FLoRA-NA leverages the local LoRA matrices on the server to estimate the aggregated matrices $\hat{A}$ and $\hat{B}$, which are then distributed to clients for local updates. These surrogate aggregated matrices minimize the divergence between the ideal update $\nabla \bar{W} = \sum^{U}_{u=1} B_u A_u$ and the practical update $\nabla \hat{W} = \hat{B}\hat{A}$ without adding communication cost beyond vanilla FedLoRA. By doing so, FLoRA-NA achieves communication efficiency and bridges the gap between local personalization and global generalization, addressing a key limitation of prior personalized FedLoRA approaches. We conduct extensive evaluations across diverse tasks, including natural language understanding, mathematical reasoning, and code generation, using various foundation models. Experimental results consistently demonstrate that FLoRA-NA achieves state-of-the-art global performance while maintaining low communication overhead.
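To make the inexact-update problem concrete, here is a minimal NumPy sketch of the gap the abstract describes: averaging the LoRA factors $B_u$ and $A_u$ separately (vanilla FedLoRA) differs from the ideal aggregate of the products $B_u A_u$. The truncated-SVD refactorization below is only an illustrative stand-in for a surrogate $\hat{B}, \hat{A}$ pair; it is not the paper's actual gradient-approximation estimator, and all dimensions and client counts are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
U, d, k, r = 4, 32, 16, 4  # hypothetical: clients, out dim, in dim, LoRA rank

# Per-client LoRA factors: B_u is (d x r), A_u is (r x k).
Bs = [rng.normal(size=(d, r)) for _ in range(U)]
As = [rng.normal(size=(r, k)) for _ in range(U)]

# Ideal aggregated update from the abstract: sum_u B_u A_u.
ideal = sum(B @ A for B, A in zip(Bs, As))

# Vanilla FedLoRA: aggregate factors separately, then multiply.
# (sum B_u)(sum A_u) != sum (B_u A_u), so this update is inexact.
naive = sum(Bs) @ sum(As)

# One way to build a surrogate rank-r pair (hat{B}, hat{A}) on the server:
# refactor the ideal update via truncated SVD. This minimizes the
# Frobenius divergence || ideal - hat{B} hat{A} || over all rank-r pairs.
Usvd, s, Vt = np.linalg.svd(ideal, full_matrices=False)
B_hat = Usvd[:, :r] * s[:r]   # absorb singular values into hat{B}
A_hat = Vt[:r]

err_naive = np.linalg.norm(ideal - naive)
err_hat = np.linalg.norm(ideal - B_hat @ A_hat)
assert err_hat <= err_naive  # the surrogate tracks the ideal update better
```

Because the truncated SVD is the optimal rank-$r$ approximation in Frobenius norm, the surrogate's divergence from the ideal update can never exceed that of the separately averaged factors; FLoRA-NA's estimator targets the same divergence using only the uploaded low-rank parameters.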