🤖 AI Summary
This work addresses decentralized asynchronous federated learning in serverless edge AI settings, relaxing the conventional strong assumptions of data homogeneity, global synchronization, and uniform update rules, which are ill-suited to dynamic, resource-constrained wireless edge networks.
Method: We propose a decoupled communication-computation scheduling mechanism enabling fully autonomous device participation, continuous local training, and fault-tolerant straggler management. Our framework integrates continuous-time gossip protocols, asynchronous stochastic gradient descent (SGD), and row-stochastic network topology modeling.
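As a hedged illustration of the row-stochastic topology modeling mentioned above (the 4-node directed graph and uniform weights are assumptions for this sketch, not the paper's construction), each agent averages over its in-neighbors with weights that sum to one per row. Unlike a doubly stochastic matrix, the columns need not sum to one, which is easier to realize over directed wireless links:

```python
import numpy as np

# Hypothetical directed topology: in_neighbors[i] lists the agents
# (including itself) that agent i can receive models from.
in_neighbors = {0: [0, 1, 2], 1: [1, 2], 2: [2, 3], 3: [3, 0]}

n = len(in_neighbors)
W = np.zeros((n, n))
for i, nbrs in in_neighbors.items():
    W[i, nbrs] = 1.0 / len(nbrs)  # uniform weights over in-neighbors

# Row-stochastic: every row sums to 1; column sums are unconstrained,
# so no coordination is needed to balance outgoing weights.
assert np.allclose(W.sum(axis=1), 1.0)

x = np.array([0.0, 1.0, 2.0, 3.0])  # each agent's local scalar model
for _ in range(100):
    x = W @ x  # repeated mixing drives the values to a common point
```

With these weights the values converge to 22/13 ≈ 1.69, a Perron-weighted combination of the initial values rather than their arithmetic mean 1.5. This is why analyses of row-stochastic (as opposed to doubly stochastic) mixing must track the induced consensus weights.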
Contribution/Results: We establish rigorous convergence guarantees under non-i.i.d. data and time-varying topologies. Experiments demonstrate significant improvements in training stability and convergence speed compared to state-of-the-art baselines. The approach provides a novel paradigm for collaborative learning across heterogeneous edge devices without centralized coordination or strict synchronization, advancing practical deployment of federated learning in real-world wireless edge environments.
📝 Abstract
Recent developments and emerging use cases, such as the smart Internet of Things (IoT) and Edge AI, have sparked considerable interest in training neural networks over fully decentralized (serverless) networks. A major challenge in decentralized learning is ensuring stable convergence without resorting to strong per-agent assumptions on data distributions or update policies. To address these issues, we propose DRACO, a novel method for decentralized asynchronous Stochastic Gradient Descent (SGD) over row-stochastic gossip wireless networks that leverages continuous communication. Our approach enables edge devices in decentralized networks to perform local training and model exchange along a continuous timeline, eliminating the need for synchronized timing. The algorithm also decouples communication and computation schedules, granting every user complete autonomy and keeping straggler handling manageable. Through a comprehensive convergence analysis, we highlight the advantages of asynchronous and autonomous participation in decentralized optimization. Our numerical experiments corroborate the efficacy of the proposed technique.
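The decoupled, continuous-timeline schedule described in the abstract can be mimicked, in discretized form, by an event-driven simulation in which each agent's computation and communication events fire independently. This is a toy sketch, not DRACO itself: the quadratic local losses, event probabilities, fully connected neighbor choice, and 0.5 mixing weight are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_agents, dim, lr = 4, 3, 0.1
# Non-i.i.d. setting: each agent's (hypothetical) quadratic loss
# ||x - target_i||^2 has a different local optimum.
targets = rng.normal(size=(n_agents, dim))
models = np.zeros((n_agents, dim))

for _ in range(2000):
    i = rng.integers(n_agents)          # the agent whose event fires next
    if rng.random() < 0.7:              # computation event: one local SGD step
        models[i] -= lr * (models[i] - targets[i])
    else:                               # communication event: gossip push
        # pick a random neighbor j != i (fully connected here for brevity)
        j = (i + 1 + rng.integers(n_agents - 1)) % n_agents
        # receiver mixes the sender's model with a row-stochastic weight 0.5;
        # the sender never waits on j, so stragglers cannot block anyone
        models[j] = 0.5 * models[j] + 0.5 * models[i]
```

Because no event waits on another agent, a slow device simply fires fewer events while the rest keep training and mixing, which is the kind of straggler tolerance the decoupled schedule is meant to provide.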