Route-and-Aggregate Decentralized Federated Learning Under Communication Errors

📅 2025-03-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the flooding overhead and poor robustness of conventional gossip-based decentralized federated learning (D-FL) under unreliable communication, this paper proposes a Route-and-Aggregate (R&A) D-FL framework. Clients exchange models with their peers over established routes rather than by network-wide flooding, and routes are selected to minimize the end-to-end packet error rate. Each client also adaptively normalizes its aggregation coefficients to compensate for models lost to communication errors. The analysis links D-FL convergence to routing and link quality, showing that convergence is best when models travel over the routes with the minimum end-to-end packet error rate. In experiments on a 10-client network, R&A D-FL achieves 35% higher training accuracy than flooding-based D-FL. In another test with 10 clients, its training accuracy under communication errors approaches that of ideal, error-free centralized FL (C-FL) as the number of non-training routing nodes rises to 28.

📝 Abstract

Decentralized federated learning (D-FL) allows clients to aggregate learning models locally, offering flexibility and scalability. Existing D-FL methods use gossip protocols, which are inefficient when not all nodes in the network are D-FL clients. This paper puts forth a new D-FL strategy, termed Route-and-Aggregate (R&A) D-FL, where participating clients exchange models with their peers through established routes (as opposed to flooding) and adaptively normalize their aggregation coefficients to compensate for communication errors. The impact of routing and imperfect links on the convergence of R&A D-FL is analyzed, revealing that convergence is maximized when routes with the minimum end-to-end packet error rates are employed to deliver models. Our analysis is experimentally validated through three image classification tasks and two next-word prediction tasks, utilizing widely recognized datasets and models. R&A D-FL outperforms the flooding-based D-FL method in terms of training accuracy by 35% in our tested 10-client network, and shows strong synergy between D-FL and networking. In another test with 10 D-FL clients, the training accuracy of R&A D-FL with communication errors approaches that of the ideal centralized FL (C-FL) without communication errors, as the number of routing nodes (i.e., nodes that do not participate in the training of D-FL) rises to 28.
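
The route selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that packet errors on different hops are independent, so the end-to-end success probability is the product of the per-hop success probabilities, and it uses plain Dijkstra on the additive cost -log(1 - p). The function name `min_per_route` and the `links` dictionary layout are hypothetical.

```python
import heapq
import math


def min_per_route(links, src, dst):
    """Return (path, end_to_end_per) for the route with the lowest end-to-end PER.

    links maps node -> {neighbor: per-hop packet error rate in [0, 1)}.
    Assumption (not from the paper): hop errors are independent.
    """
    # Minimizing 1 - prod(1 - p) is equivalent to minimizing sum(-log(1 - p)).
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, p in links.get(u, {}).items():
            nd = d - math.log1p(-p)  # add the cost -log(1 - p) of this hop
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, 1.0  # unreachable: every packet is lost
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return path, 1.0 - math.exp(-dist[dst])


# Hypothetical topology: clients A and D, routing nodes B and C.
links = {
    "A": {"B": 0.1, "C": 0.3, "D": 0.25},
    "B": {"D": 0.1},
    "C": {"D": 0.01},
    "D": {},
}
path, per = min_per_route(links, "A", "D")
print(path, round(per, 4))
```

In this toy topology the two-hop route through B has a lower end-to-end error rate than the direct link, which shows why nodes that do not train can still help deliver models.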

Problem

Research questions and friction points this paper is trying to address.

Improving decentralized federated learning efficiency under communication errors

Optimizing model aggregation with adaptive routing and normalization

Enhancing training accuracy in non-ideal network conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Route-and-Aggregate decentralized federated learning strategy

Adaptive normalization for aggregation coefficients

Routing over minimum end-to-end packet error rate paths to limit the impact of communication errors on convergence
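
The adaptive normalization idea listed above can be sketched as follows. This is an assumed form, not the paper's exact rule: each client keeps only the models that actually arrived in a round and rescales their nominal aggregation coefficients so that they sum to one. The function name `normalized_aggregate` and its arguments are hypothetical.

```python
def normalized_aggregate(models, weights, received):
    """Aggregate the models that arrived, renormalizing their coefficients.

    models:   list of equal-length parameter vectors (lists of floats)
    weights:  nominal aggregation coefficients, one per model
    received: booleans, True if that model arrived without error
    Assumption: lost models are dropped and the remaining weights are rescaled.
    """
    kept = [(w, m) for w, m, ok in zip(weights, models, received) if ok]
    total = sum(w for w, _ in kept)
    if total <= 0:
        raise ValueError("no usable model was received this round")
    dim = len(kept[0][1])
    # Weighted average over the received models only.
    return [sum(w * m[i] for w, m in kept) / total for i in range(dim)]


# The client's own model (index 0) is always available; peer 2 was lost.
models = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]
weights = [0.5, 0.25, 0.25]
received = [True, True, False]
print(normalized_aggregate(models, weights, received))
```

Without the rescaling, a lost model would shrink the aggregate toward zero; dividing by the sum of the surviving weights keeps the result a proper weighted average.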
