🤖 AI Summary
Federated learning (FL) on heterogeneous mobile devices suffers from high training latency, stale models, severe model drift, and straggler effects due to statistical, computational, and wireless communication heterogeneity.
Method: This paper proposes a joint optimization framework that co-designs gradient transmission and local computation overlap with dynamic participant selection (PS). It introduces a staleness upper bound constraint to ensure model freshness while limiting memory overhead. A staleness-aware scheduling strategy is formulated based on latency modeling and utility function optimization, complemented by an overlap-triggering metric to mitigate model drift.
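The staleness-bounded overlap and utility-based participant selection described above can be sketched roughly as follows. This is a minimal illustration under simplified assumptions, not the paper's exact formulation: the utility function (`data_size / effective_latency`), the latency model, and the staleness ceiling `S_MAX` are all hypothetical placeholders.

```python
# Hypothetical sketch of staleness-aware participant selection (PS).
# Assumptions (not from the paper): utility = data_size / effective_latency,
# where overlapping hides communication behind computation, and clients whose
# model staleness exceeds S_MAX are excluded to bound memory and keep models fresh.

S_MAX = 3  # staleness ceiling (illustrative value)

def effective_latency(client):
    # With transmission/computation overlap, per-round latency behaves like
    # max(compute, communicate) rather than their sum.
    return max(client["t_comp"], client["t_comm"])

def select_participants(clients, k):
    # Enforce the staleness upper bound, then rank by a toy utility:
    # more local data and lower effective latency are preferred.
    eligible = [c for c in clients if c["staleness"] <= S_MAX]
    ranked = sorted(eligible,
                    key=lambda c: c["data_size"] / effective_latency(c),
                    reverse=True)
    return ranked[:k]

clients = [
    {"id": 0, "t_comp": 2.0, "t_comm": 1.0, "data_size": 100, "staleness": 1},
    {"id": 1, "t_comp": 5.0, "t_comm": 4.0, "data_size": 120, "staleness": 5},  # too stale: excluded
    {"id": 2, "t_comp": 1.5, "t_comm": 3.0, "data_size": 90,  "staleness": 2},
]
print([c["id"] for c in select_participants(clients, 2)])  # → [0, 2]
```

Note how the straggler-prone client 1 is filtered out by the staleness bound before utility ranking even applies; in the actual method, the bound additionally caps how many stale gradients must be buffered, which is where the memory-cost guarantee comes from.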
Contribution/Results: Experiments demonstrate that the proposed method significantly reduces training latency with controllable memory cost. Compared to state-of-the-art baselines, it achieves substantial latency reduction while effectively alleviating straggler effects and model drift, thereby improving both convergence speed and model accuracy.
📝 Abstract
Training latency is critical to the success of many emerging applications enabled by federated learning (FL) over heterogeneous mobile devices. By overlapping local gradient transmission with continued local computing, FL can markedly reduce its training latency over homogeneous clients, yet it encounters severe model staleness, model drift, memory cost, and straggler issues in heterogeneous environments. To unleash the full potential of overlapping, we propose FedEx, a novel **fed**erated learning approach to **ex**pedite FL training over mobile devices under data, computing, and wireless heterogeneity. FedEx redefines the overlapping procedure with staleness ceilings to constrain memory consumption and make overlapping compatible with participant selection (PS) designs. FedEx then characterizes the PS utility function by accounting for the latency reduced by overlapping, and provides a holistic PS solution to address the straggler issue. FedEx also introduces a simple but effective metric that triggers overlapping only when it will not cause model drift. Experimental results show that, compared with its peer designs, FedEx achieves substantial reductions in FL training latency over heterogeneous mobile devices with limited memory cost.