🤖 AI Summary
This work addresses a critical instability issue in decentralized federated learning with Low-Rank Adaptation (LoRA): the factorized structure of LoRA introduces topology-dependent cross terms during aggregation under dynamic communication topologies, leading to divergent training behavior. To mitigate this, the authors propose TAD-LoRA, a topology-aware framework that coordinates local LoRA factor updates and cross-node aggregation through an alternating block coordinate descent mechanism, effectively controlling representation misalignment across clients. TAD-LoRA is the first method to explicitly identify and resolve the topology-induced cross-term problem inherent to LoRA in decentralized settings, and it provides convergence guarantees under non-convex objectives. Experimental results demonstrate that TAD-LoRA significantly outperforms existing baselines across diverse communication topologies—particularly in weakly connected scenarios—and achieves state-of-the-art performance on the MNLI dataset.
📝 Abstract
Decentralized federated learning (DFL), a serverless variant of federated learning, poses unique challenges for parameter-efficient fine-tuning due to the factorized structure of low-rank adaptation (LoRA). Unlike linear parameters, decentralized aggregation of LoRA updates introduces topology-dependent cross terms that can destabilize training under dynamic communication graphs. We propose \texttt{TAD-LoRA}, a Topology-Aware Decentralized Low-Rank Adaptation framework that coordinates the updates and mixing of LoRA factors to control inter-client misalignment. We theoretically prove the convergence of \texttt{TAD-LoRA} under non-convex objectives, explicitly characterizing the trade-off between topology-induced cross-term error and block-coordinate representation bias governed by the switching interval of the alternating training scheme. Experiments under various communication conditions validate our analysis, showing that \texttt{TAD-LoRA} achieves robust performance across different communication scenarios, remaining competitive in strongly connected topologies and delivering clear gains under moderately and weakly connected topologies, with particularly strong results on the MNLI dataset.
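The cross-term issue the abstract refers to can be seen with a small numerical sketch (not from the paper; all dimensions and variable names here are illustrative). Because each client's LoRA update is a product $B_iA_i$, mixing the factors $B_i$ and $A_i$ separately, as naive decentralized averaging would, does not reproduce the average of the products: the difference consists of cross terms $B_iA_j$ ($i \neq j$) between misaligned client factors.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r, n = 8, 6, 2, 4  # layer dims, LoRA rank, number of clients (illustrative)

# Each client i holds LoRA factors B_i (d x r) and A_i (r x k);
# its low-rank update to the frozen weight is the product B_i @ A_i.
Bs = [rng.normal(size=(d, r)) for _ in range(n)]
As = [rng.normal(size=(r, k)) for _ in range(n)]

# Average of the true low-rank updates: (1/n) * sum_i B_i @ A_i.
avg_of_products = sum(B @ A for B, A in zip(Bs, As)) / n

# Naive factor-wise averaging (uniform mixing of B's and A's separately),
# which expands to (1/n^2) * sum_{i,j} B_i @ A_j.
product_of_avgs = (sum(Bs) / n) @ (sum(As) / n)

# The discrepancy is exactly the cross terms B_i @ A_j with i != j;
# it is nonzero whenever clients' factors are misaligned.
gap = np.linalg.norm(avg_of_products - product_of_avgs)
print(gap > 1e-6)  # True for generic (misaligned) factors
```

Under a general gossip/mixing matrix instead of uniform averaging, these cross terms become weighted by the topology's mixing coefficients, which is why the error is topology-dependent.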