Distributed Federated Learning by Alternating Periods of Training

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a decentralized federated learning framework to address the limitations of traditional approaches that rely on a single central server, which suffers from poor scalability and vulnerability to single-point failures. The proposed architecture comprises multiple interconnected servers, each coordinating a disjoint subset of clients. Through an optimization strategy that alternates between local client training and inter-server global synchronization, the system achieves collaborative model refinement while preserving the classic client-server structure. A dedicated inter-server communication mechanism keeps the models consistent across the network. Theoretical analysis demonstrates that, under appropriate conditions, all servers converge to a common solution within a small tolerance of the ideal model. Numerical simulations confirm the algorithm's effectiveness and improved scalability.
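The alternating scheme described above can be sketched in a few lines. The toy instance below is invented for illustration, not taken from the paper: scalar mean estimation (so the ideal model is known in closed form), 3 servers with 2 clients each, 5 local SGD steps per period, a step size of 0.1, and a complete inter-server graph (so the global period reduces to a full average) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each client holds scalar samples; the global objective is the
# mean of all samples, so the ideal model is the overall sample mean.
n_servers, n_clients = 3, 2  # clients per server (illustrative)
data = [[rng.normal(loc=s, scale=0.5, size=50) for _ in range(n_clients)]
        for s in range(n_servers)]
ideal = np.mean([x for srv in data for client in srv for x in client])

servers = np.zeros(n_servers)  # one scalar model per server

for period in range(200):
    # Local training period: each server's clients take gradient steps on
    # their own data, then the server averages its clients' models.
    for s in range(n_servers):
        client_models = []
        for c in range(n_clients):
            w = servers[s]
            for _ in range(5):  # local SGD steps
                # gradient of the client loss 0.5 * mean((w - x_i)^2)
                w -= 0.1 * (w - data[s][c].mean())
            client_models.append(w)
        servers[s] = np.mean(client_models)
    # Global training period: servers synchronize over a complete
    # communication graph, which here is a plain average.
    servers[:] = servers.mean()

print(abs(servers[0] - ideal))  # deviation from the ideal model
```

Because all clients hold equally sized samples, the fixed point of the alternation is exactly the overall sample mean, and all servers reach consensus on it.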

📝 Abstract
Federated learning is a privacy-focused approach towards machine learning where models are trained on client devices with locally available data and aggregated at a central server. However, the dependence on a single central server is challenging in the case of a large number of clients and even poses the risk of a single point of failure. To address these critical limitations of scalability and fault-tolerance, we present a distributed approach to federated learning comprising multiple servers with inter-server communication capabilities. While providing a fully decentralized approach, the designed framework retains the core federated learning structure where each server is associated with a disjoint set of clients with server-client communication capabilities. We propose a novel DFL (Distributed Federated Learning) algorithm which uses alternating periods of local training on the client data followed by global training among servers. We show that the DFL algorithm, under a suitable choice of parameters, ensures that all the servers converge to a common model value within a small tolerance of the ideal model, thus exhibiting effective integration of local and global training models. Finally, we illustrate our theoretical claims through numerical simulations.
Problem

Research questions and friction points this paper is trying to address.

federated learning
scalability
fault-tolerance
central server
distributed system
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed Federated Learning
Decentralized Architecture
Alternating Training
Server-to-Server Communication
Convergence Guarantee
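The server-to-server communication step can also be illustrated in isolation as gossip averaging. The doubly stochastic mixing matrix, ring topology of 4 servers, and initial model values below are assumptions made for this sketch; the paper's exact mechanism may differ.

```python
import numpy as np

# Symmetric, doubly stochastic mixing matrix for a ring of 4 servers:
# each server keeps half its own model and mixes a quarter from each
# of its two neighbors.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

models = np.array([1.0, 3.0, 5.0, 7.0])  # per-server models after local training
target = models.mean()                   # consensus value preserved by W

for _ in range(100):
    models = W @ models  # each server averages with its neighbors

print(models)  # all entries converge to the initial average, 4.0
```

Since the rows and columns of `W` each sum to 1, the average of the server models is preserved at every step, and repeated mixing drives all servers to that common value, matching the paper's claim that servers converge to a shared model.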