FedEx: Expediting Federated Learning over Heterogeneous Mobile Devices by Overlapping and Participant Selection

📅 2024-07-01
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Federated learning (FL) on heterogeneous mobile devices suffers from high training latency, stale models, severe model drift, and straggler effects due to statistical, computational, and wireless communication heterogeneity. Method: This paper proposes a joint optimization framework that co-designs gradient transmission and local computation overlap with dynamic participant selection (PS). It introduces a staleness upper bound constraint to ensure model freshness while limiting memory overhead. A staleness-aware scheduling strategy is formulated based on latency modeling and utility function optimization, complemented by an overlap-triggering metric to mitigate model drift. Contribution/Results: Experiments demonstrate that the proposed method significantly reduces training latency with controllable memory cost. Compared to state-of-the-art baselines, it achieves substantial latency reduction while effectively alleviating straggler effects and model drift, thereby improving both convergence speed and model accuracy.
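The staleness ceiling described above can be illustrated with a minimal server-side sketch. This is not the paper's implementation; the function name, the staleness-weighted average, and the drop-stale policy are illustrative assumptions about how a staleness upper bound keeps buffered updates (and thus memory) bounded while overlapping computation across rounds.

```python
# Hedged sketch (illustrative, not FedEx's actual algorithm): aggregate only
# client updates whose staleness is within a ceiling tau_max, so that the
# server never has to buffer arbitrarily old gradients.

def aggregate_with_staleness_ceiling(global_version, updates, tau_max):
    """updates: list of (model_version, gradient_vector) pairs.
    Keeps updates with staleness (global_version - model_version) <= tau_max,
    then averages them with fresher updates weighted more heavily."""
    fresh = [(v, g) for (v, g) in updates if global_version - v <= tau_max]
    if not fresh:
        return None  # nothing fresh enough to apply this round
    # simple staleness-aware weighting: weight 1/(1 + staleness)
    weights = [1.0 / (1 + global_version - v) for (v, _) in fresh]
    total = sum(weights)
    dim = len(fresh[0][1])
    agg = [0.0] * dim
    for w, (_, g) in zip(weights, fresh):
        for i in range(dim):
            agg[i] += (w / total) * g[i]
    return agg
```

Under this sketch, an update computed on a model four rounds old is simply dropped when the ceiling is two, which is what bounds the memory overhead of overlapping.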

📝 Abstract
Training latency is critical to the success of many intriguing applications enabled by federated learning (FL) over heterogeneous mobile devices. By overlapping local gradient transmission with continuous local computing, FL can remarkably reduce its training latency over homogeneous clients, yet it encounters severe model staleness, model drift, memory cost, and straggler issues in heterogeneous environments. To unleash the full potential of overlapping, we propose FedEx, a novel federated learning approach to expedite FL training over mobile devices under data, computing, and wireless heterogeneity. FedEx redefines the overlapping procedure with staleness ceilings to constrain memory consumption and make overlapping compatible with participant selection (PS) designs. FedEx then characterizes the PS utility function by accounting for the latency reduced by overlapping, and provides a holistic PS solution to address the straggler issue. FedEx also introduces a simple but effective metric to trigger overlapping and thereby avoid model drift. Experimental results show that, compared with its peer designs, FedEx achieves substantial reductions in FL training latency over heterogeneous mobile devices with limited memory cost.
Problem

Research questions and friction points this paper is trying to address.

Reducing FL training latency on heterogeneous mobile devices
Addressing model staleness and memory cost in FL
Optimizing participant selection to mitigate straggler issues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Overlapping local gradient transmission with computing
Redefining overlapping with staleness ceilings
Holistic participant selection solution for stragglers
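The innovation points above can be sketched together: when transmission overlaps with computation, a client's effective round latency shrinks from compute + upload toward max(compute, upload), and a PS utility that credits this favors different clients. The utility form below (data size divided by effective latency) is an illustrative assumption, not the paper's exact utility function.

```python
# Hedged sketch (illustrative utility, not FedEx's exact formulation):
# rank clients by data contribution per unit of effective round latency,
# where overlapping makes latency max(compute, upload) instead of their sum.

def select_participants(clients, k, overlap=True):
    """clients: list of dicts with 'id', 'data' (samples),
    'comp_s' (local compute seconds), 'comm_s' (upload seconds).
    Returns the ids of the k highest-utility clients."""
    def utility(c):
        if overlap:
            latency = max(c["comp_s"], c["comm_s"])  # transmission hidden behind compute
        else:
            latency = c["comp_s"] + c["comm_s"]       # sequential compute-then-upload
        return c["data"] / latency
    ranked = sorted(clients, key=utility, reverse=True)
    return [c["id"] for c in ranked[:k]]
```

For example, a client with balanced compute and upload times gains the most from overlapping, so it can outrank a communication-light straggler that would win under a sequential latency model.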
👥 Authors
Jiaxiang Geng — The University of Hong Kong; Beijing University of Posts and Telecommunications (Federated Learning, Foundation Model, Mobile Computing, Integrated Sensing and Communication)
Boyu Li — Beijing University of Posts and Telecommunications, Beijing, China
Xiaoqi Qin — Beijing University of Posts and Telecommunications
Yixuan Li — Beijing University of Posts and Telecommunications, Beijing, China
Liang Li — Peng Cheng Laboratory, Shenzhen, China
Yanzhao Hou — Beijing University of Posts and Telecommunications, Beijing, China
Miao Pan — Professor, Electrical and Computer Engineering, University of Houston (Wireless for AI, Cybersecurity for AI, Mobile/Edge AI Systems, Underwater IoT Nets)