FLoRG: Federated Fine-tuning with Low-rank Gram Matrices and Procrustes Alignment

📅 2026-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the aggregation error and decomposition drift in federated learning with Low-Rank Adaptation (LoRA), which arise from the separate aggregation of its two low-rank matrices. To mitigate these issues, the authors propose a novel federated fine-tuning framework based on a single low-rank matrix, integrating Gram matrix aggregation with Procrustes alignment for the first time. This approach effectively eliminates drift caused by non-unique matrix decompositions while substantially reducing communication overhead. Experimental results demonstrate that the proposed framework consistently outperforms five state-of-the-art methods across multiple large language model fine-tuning benchmarks, achieving higher downstream task accuracy and reducing communication costs by up to 2,041-fold.
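The Gram-matrix aggregation idea described above can be sketched in a few lines of NumPy. This is a toy illustration of the general recipe, not the paper's exact protocol: the matrix shapes, the weighted averaging, and the eigendecomposition-based factorization are all assumptions, and the function names are illustrative.

```python
import numpy as np

def aggregate_gram(grams, weights):
    """Weighted average of client Gram matrices (each r x r)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * G for wi, G in zip(w, grams))

def factorize_gram(G):
    """Recover a factor B satisfying B.T @ B == G (G symmetric PSD).

    The factorization is non-unique: for any orthogonal Q,
    (Q @ B).T @ (Q @ B) == G as well -- this ambiguity is the
    source of the decomposition drift the paper targets.
    """
    vals, vecs = np.linalg.eigh(G)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return np.diag(np.sqrt(vals)) @ vecs.T
```

Because a client with a d x r low-rank matrix only uploads its r x r Gram matrix, the per-round payload shrinks by roughly a factor of d/r, which is consistent with the large communication savings reported.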

📝 Abstract
Parameter-efficient fine-tuning techniques such as low-rank adaptation (LoRA) enable large language models (LLMs) to adapt to downstream tasks efficiently. Federated learning (FL) further facilitates this process by enabling collaborative fine-tuning across distributed clients without sharing private data. However, the use of two separate low-rank matrices in LoRA for federated fine-tuning introduces two challenges. The first is the error induced by aggregating the two low-rank matrices separately. The second arises even when the product of the two low-rank matrices is aggregated: the server must then recover the factors via matrix decomposition, which is non-unique and can introduce decomposition drift. To tackle these challenges, we propose FLoRG, a federated fine-tuning framework which employs a single low-rank matrix for fine-tuning and aggregates its Gram matrix (i.e., the matrix of inner products of its column vectors), eliminating the aggregation error while also reducing the communication overhead. FLoRG minimizes the decomposition drift by introducing a Procrustes alignment approach which aligns the decomposed matrix between consecutive fine-tuning rounds for consistent updates. We theoretically analyze the convergence of FLoRG and prove that adopting the Procrustes alignment results in a tighter convergence bound. Experimental results across multiple LLM fine-tuning benchmarks demonstrate that FLoRG outperforms five state-of-the-art baseline schemes in downstream task accuracy and can reduce the communication overhead by up to 2041×.
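The Procrustes alignment step mentioned in the abstract is the classic orthogonal Procrustes problem: find the orthogonal rotation that maps the freshly decomposed factor as close as possible to the previous round's factor, so that successive updates stay consistent. A minimal sketch, assuming the standard SVD-based closed-form solution (the function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def procrustes_align(B, B_prev):
    """Rotate B by the orthogonal Q minimizing ||Q @ B - B_prev||_F.

    Classic orthogonal Procrustes solution: with the SVD
    B_prev @ B.T = U @ S @ Vt, the minimizer is Q = U @ Vt.
    """
    U, _, Vt = np.linalg.svd(B_prev @ B.T)
    return (U @ Vt) @ B
```

Since any decomposition of the Gram matrix is only determined up to an orthogonal transform, snapping each round's factor onto the previous one in this way removes the arbitrary rotation between rounds.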
Problem

Research questions and friction points this paper is trying to address.

Federated Learning
Low-rank Adaptation
Matrix Decomposition
Aggregation Error
Decomposition Drift
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Low-rank Adaptation
Gram Matrix
Procrustes Alignment
Parameter-efficient Fine-tuning
👥 Authors
Chuiyang Meng (The University of British Columbia)
Ming Tang (Southern University of Science and Technology)
Vincent W. S. Wong (The University of British Columbia)