DySTop

📅 2025-08-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address model staleness and high communication overhead in Asynchronous Decentralized Federated Learning (ADFL) under heterogeneous, dynamic edge environments, this paper proposes a novel joint optimization framework integrating dynamic staleness control with phase-aware topology construction. The authors design a worker activation strategy and an adaptive topology construction algorithm to mitigate gradient delay and reduce redundant transmissions while preserving convergence guarantees. Theoretical analysis establishes convergence under non-IID data distributions. Experiments demonstrate that, compared to state-of-the-art methods, the approach reduces training completion time by 51.8% and communication overhead by 57.1% without sacrificing model accuracy. The core innovation lies in the co-modeling of dynamic staleness regulation and topology evolution, achieving Pareto improvements in training efficiency, communication cost, and model performance.

📝 Abstract
Federated Learning (FL) has emerged as a promising distributed learning paradigm that enables model training on edge devices (i.e., workers) while preserving data privacy. However, its reliance on a centralized server leads to limited scalability. Decentralized federated learning (DFL) eliminates the dependency on a centralized server by enabling peer-to-peer model exchange. Existing DFL mechanisms mainly employ synchronous communication, which may result in training inefficiencies under heterogeneous and dynamic edge environments. Although a few recent asynchronous DFL (ADFL) mechanisms have been proposed to address these issues, they typically yield stale model aggregation and frequent model transmission, leading to degraded training performance on non-IID data and high communication overhead. To overcome these issues, we present DySTop, an innovative mechanism that jointly optimizes dynamic staleness control and topology construction in ADFL. In each round, multiple workers are activated, and a subset of their neighbors is selected to transmit models for aggregation, followed by local training. We provide a rigorous convergence analysis for DySTop, theoretically revealing the quantitative relationships between the convergence bound and key factors such as maximum staleness, activation frequency, and data distribution among workers. From the insights of the analysis, we propose a worker activation algorithm (WAA) for staleness control and a phase-aware topology construction algorithm (PTCA) to reduce communication overhead and handle non-IID data. Extensive evaluations through both large-scale simulations and real-world testbed experiments demonstrate that DySTop reduces completion time by 51.8% and communication resource consumption by 57.1% compared to state-of-the-art solutions, while maintaining the same model accuracy.
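The round structure described in the abstract (activate workers, pull models from a neighbor subset, aggregate, then train locally) can be sketched in miniature. This is an illustrative toy simulation, not the paper's algorithm: the `1/(1+staleness)` weighting, the fixed staleness bound, and the neighbor-subset rule are all assumptions standing in for the WAA/PTCA policies the paper derives from its convergence analysis.

```python
MAX_STALENESS = 4  # tau_max: bound enforced by staleness control (assumed value)

class Worker:
    def __init__(self, wid, neighbors, model=0.0):
        self.wid = wid
        self.neighbors = neighbors   # peer worker ids in the current topology
        self.model = model           # scalar stand-in for model parameters
        self.version = 0             # round in which this model was last updated

def staleness_weight(delta):
    """Down-weight stale neighbor models; 1/(1+delta) is one common choice."""
    return 1.0 / (1.0 + delta)

def run_round(workers, activated_ids, subset_size, global_round):
    """One ADFL round: aggregate from a neighbor subset, then train locally."""
    for wid in activated_ids:
        w = workers[wid]
        # Phase-aware topology construction would pick this subset adaptively;
        # here we simply take the first `subset_size` neighbors (assumption).
        chosen = w.neighbors[:subset_size]
        # Staleness control: ignore neighbors whose models are too old.
        fresh = [workers[n] for n in chosen
                 if global_round - workers[n].version <= MAX_STALENESS]
        # Staleness-weighted aggregation of own model and fresh neighbor models.
        pairs = [(w.model, 1.0)] + [
            (p.model, staleness_weight(global_round - p.version)) for p in fresh
        ]
        total = sum(wt for _, wt in pairs)
        w.model = sum(m * wt for m, wt in pairs) / total
        # Local training step (placeholder: nudge the model toward a target).
        w.model += 0.1 * (1.0 - w.model)
        w.version = global_round
```

Even at this level of abstraction, the sketch shows why only the activated workers and their selected neighbors exchange models in a round, which is where the communication savings come from.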
Problem

Research questions and friction points this paper is trying to address.

Addresses inefficiency in decentralized federated learning under dynamic edge environments
Reduces stale model aggregation and high communication overhead in ADFL
Optimizes dynamic staleness control and topology construction for non-IID data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic staleness control in ADFL
Phase-aware topology construction algorithm
Worker activation for efficient training
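The staleness-control idea behind the worker activation bullet can be illustrated with a simple policy: force-activate any worker about to violate the staleness bound, then spend remaining per-round capacity on the stalest of the rest. The function name, the capacity parameter, and the rule itself are hypothetical stand-ins; the paper's WAA derives its policy from the convergence bound.

```python
def select_activations(last_active, current_round, max_staleness, capacity):
    """Pick worker ids to activate this round (illustrative policy).

    last_active: dict mapping worker_id -> round of last activation
    capacity: max workers activatable per round (assumed constraint)
    """
    # Workers that would otherwise exceed the staleness bound must go first.
    must = [w for w, r in last_active.items()
            if current_round - r >= max_staleness]
    # Fill remaining capacity with the stalest of the remaining workers.
    rest = sorted((w for w in last_active if w not in must),
                  key=lambda w: last_active[w])
    # Note: if `must` alone exceeds capacity, the bound cannot be honored.
    return (must + rest)[:capacity]
```

This bounds the maximum staleness seen at aggregation time, which the abstract identifies as one of the key factors in the convergence bound.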
Yizhou Shi
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China
Qianpiao Ma
Nanjing University of Science and Technology
Federated Learning · Edge Computing
Yan Xu
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China
Junlong Zhou
Nanjing University of Science and Technology
Embedded Systems · Edge-Cloud Computing · Cyber-Physical Systems
Ming Hu
School of Computing and Information Systems, Singapore Management University, Singapore
Yunming Liao
University of Science and Technology of China
Edge Intelligence · Edge Computing · Federated Learning · Split Federated Learning
Hongli Xu
University of Science and Technology of China
Software Defined Networking · Cooperative Communication · Sensor Networks