🤖 AI Summary
This work proposes a decentralized federated learning framework to address the limitations of traditional approaches that rely on a single central server, which suffer from poor scalability and vulnerability to single-point failures. The proposed architecture comprises multiple interconnected servers, each coordinating a disjoint subset of clients. Through an optimization strategy that alternates between local client training and inter-server global synchronization, the system achieves collaborative model refinement while preserving the classic client-server structure. A dedicated inter-server communication mechanism keeps the models consistent across the network. Theoretical analysis shows that, under suitable conditions, all servers converge to a common solution within a small tolerance of the ideal model. Empirical evaluations confirm the algorithm's effectiveness and improved scalability.
📝 Abstract
Federated learning is a privacy-focused approach to machine learning in which models are trained on client devices using locally available data and aggregated at a central server. However, dependence on a single central server limits scalability as the number of clients grows and poses the risk of a single point of failure. To address these critical limitations of scalability and fault tolerance, we present a distributed approach to federated learning comprising multiple servers with inter-server communication capabilities. While fully decentralized, the designed framework retains the core federated learning structure: each server is associated with a disjoint set of clients and communicates directly with them. We propose a novel DFL (Distributed Federated Learning) algorithm that alternates between periods of local training on client data and global training among servers. We show that, under a suitable choice of parameters, the DFL algorithm ensures that all servers converge to a common model within a small tolerance of the ideal model, thus effectively integrating local and global training. Finally, we illustrate our theoretical claims through numerical simulations.
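To make the alternating local/global structure concrete, here is a minimal toy simulation of the idea: each server's clients take a few local gradient steps on simple quadratic losses, the server averages its clients (FedAvg-style), and then servers average among themselves via a doubly stochastic mixing matrix. All names, the ring-like topology, the quadratic losses, and the hyperparameters are illustrative assumptions for this sketch, not the paper's actual DFL algorithm or analysis.

```python
import numpy as np

# Illustrative setup (assumed, not from the paper): 3 servers, each with
# 2 clients; client i holds a quadratic loss f_i(w) = 0.5 * ||w - t_i||^2,
# whose minimizer is the client-specific target t_i.
rng = np.random.default_rng(0)
n_servers, clients_per_server, dim = 3, 2, 4
targets = rng.normal(size=(n_servers, clients_per_server, dim))

# Each server starts from its own model.
models = rng.normal(size=(n_servers, dim))

# Doubly stochastic mixing matrix for the inter-server averaging phase.
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

lr, local_steps, rounds = 0.1, 5, 200
for _ in range(rounds):
    new_models = np.empty_like(models)
    for s in range(n_servers):
        # Local phase: each client starts from its server's model and
        # takes gradient steps (gradient of the quadratic is w - t_i).
        client_models = np.tile(models[s], (clients_per_server, 1))
        for _ in range(local_steps):
            client_models -= lr * (client_models - targets[s])
        # Server aggregates its own clients' models.
        new_models[s] = client_models.mean(axis=0)
    # Global phase: servers exchange models and average via W.
    models = W @ new_models

# Servers end up close to each other and near the average of all client
# optima; heterogeneity re-injected by local training leaves a small
# residual deviation, mirroring the "small tolerance" in the analysis.
global_opt = targets.reshape(-1, dim).mean(axis=0)
print("max deviation from global optimum:",
      np.max(np.abs(models - global_opt)))
```

Running this shows the qualitative behavior claimed in the abstract: all server models agree with one another up to a small tolerance, and that common value sits near the minimizer of the aggregate objective.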