🤖 AI Summary
In federated learning, hardware heterogeneity across devices causes stragglers to impede global training convergence; existing acceleration strategies (such as client selection, asynchronous updates, and partial training) often compromise model accuracy or introduce gradient bias. To address this, we propose an elastic learning framework that enables piecewise model training via a sliding-window mechanism and incorporates a dynamic tensor importance selection module to adaptively prune non-critical parameters on resource-constrained devices, jointly optimizing local update quality and global convergence efficiency. Furthermore, the framework introduces runtime budget allocation and tensor importance alignment to mitigate biases arising from dual heterogeneity in data and device capabilities. Extensive experiments demonstrate that our approach achieves up to a 3.87x improvement in time-to-accuracy over state-of-the-art baselines while maintaining comparable test accuracy.
📝 Abstract
Federated learning (FL) enables distributed devices to collaboratively train machine learning models while maintaining data privacy. However, the heterogeneous hardware capabilities of devices often result in significant training delays, as straggler clients with limited resources prolong the aggregation process. Existing solutions such as client selection, asynchronous FL, and partial training partially address these challenges but encounter issues such as reduced accuracy, stale updates, and compromised model performance due to inconsistent training contributions. To overcome these limitations, we propose FedEL, a federated elastic learning framework that enhances training efficiency while maintaining model accuracy. FedEL introduces a novel window-based training process, sliding the window to locate the trainable part of the model and dynamically selecting important tensors for training within a coordinated runtime budget. This approach ensures progressive and balanced training across all clients, including stragglers. Additionally, FedEL employs a tensor importance adjustment module, harmonizing local and global tensor importance to mitigate biases caused by data heterogeneity. Experimental results show that FedEL achieves up to 3.87x improvement in time-to-accuracy compared to baselines while maintaining or exceeding final test accuracy.
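To make the window-based selection concrete, the sketch below shows one plausible reading of the mechanism: a sliding window restricts which tensors a client may train, the most important tensors inside the window are picked greedily until the client's runtime budget is spent, and local importance scores are blended with global ones to counter data-heterogeneity bias. All function names, the greedy budget policy, and the convex-blend harmonization are illustrative assumptions, not the paper's exact algorithm.

```python
# Hypothetical sketch of FedEL-style elastic training (assumed names/logic,
# not the authors' implementation).

def select_trainable_tensors(tensor_costs, tensor_importance,
                             window_start, window_size, runtime_budget):
    """Pick tensor indices inside a sliding window, highest importance
    first, until the client's per-round runtime budget is exhausted."""
    n = len(tensor_costs)
    window = range(window_start, min(window_start + window_size, n))
    # Rank tensors in the window by importance, descending.
    ranked = sorted(window, key=lambda i: tensor_importance[i], reverse=True)
    selected, spent = [], 0.0
    for i in ranked:
        if spent + tensor_costs[i] <= runtime_budget:
            selected.append(i)
            spent += tensor_costs[i]
    return sorted(selected)

def harmonize_importance(local_scores, global_scores, alpha=0.5):
    """Blend per-client and server-aggregated importance scores
    (a simple convex combination; the blend rule is an assumption)."""
    return [alpha * l + (1 - alpha) * g
            for l, g in zip(local_scores, global_scores)]

# Toy example: 6 tensors with per-round training costs and importance scores.
costs      = [1.0, 2.0, 1.5, 0.5, 2.5, 1.0]
importance = [0.2, 0.9, 0.4, 0.8, 0.1, 0.6]
print(select_trainable_tensors(costs, importance,
                               window_start=1, window_size=4,
                               runtime_budget=4.0))  # -> [1, 2, 3]
```

A straggler would be given a smaller `runtime_budget` (and possibly a smaller window), so it still contributes fresh updates each round instead of stalling aggregation; advancing `window_start` across rounds is what makes the partial training progressive.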