🤖 AI Summary
Federated learning (FL) on edge devices faces the dual challenges of resource constraints and system heterogeneity, which lead to insufficient local updates, slow convergence, and high sensitivity to hyperparameter tuning, particularly of the learning rate. To address these issues, the authors propose GeL, a gradient-free "Guess-and-Learn" mechanism that supplements gradient-based optimization with gradient-agnostic parameter guessing, allowing clients to adapt dynamically to fluctuating computational resources. GeL is designed as a modular, plug-and-play component that integrates with state-of-the-art FL algorithms, including FedProx, FedNova, and FedYogi. Extensive experiments show that GeL accelerates convergence by up to 40% under resource-limited conditions and substantially reduces the need for careful learning rate tuning across heterogeneous device environments, without sacrificing efficiency or accuracy.
📝 Abstract
Federated learning (FL) enables a set of client devices to collaboratively train a model without sharing raw data. This process, however, operates under the constrained computation and communication resources of edge devices. These constraints, combined with system heterogeneity, force some participating clients to perform fewer local updates than the server expects, thus slowing down convergence. Furthermore, exhaustive hyperparameter tuning in FL is resource-intensive, yet forgoing it adversely affects convergence. In this work, we propose GeL, the guess-and-learn algorithm. GeL enables constrained edge devices to perform additional learning through guessed updates on top of gradient-based steps. These guesses are gradientless, i.e., participating clients obtain them essentially for free. Our generic guessing algorithm (i) can be flexibly combined with several state-of-the-art algorithms, including FedProx, FedNova, and FedYogi; and (ii) achieves significantly improved performance when the learning rates are not well tuned. We conduct extensive experiments and show that GeL can boost empirical convergence by up to 40% in resource-constrained networks while relieving the need for exhaustive learning rate tuning.
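The abstract does not spell out the guessing rule, but the core idea of gradient-free guessed updates layered on gradient-based steps can be illustrated with a toy sketch. In the hypothetical rule below, a client extrapolates along the average per-step movement of its preceding gradient phase; the function name `local_update_with_guesses` and this particular extrapolation rule are illustrative assumptions, not necessarily the paper's exact mechanism.

```python
import numpy as np

def local_update_with_guesses(w, grad_fn, lr=0.1, grad_steps=5, guess_steps=3):
    """Run gradient steps, then append gradient-free 'guessed' steps.

    Hypothetical sketch only: the guess direction here is the average
    per-step movement of the gradient phase; the paper's actual
    guessing rule may differ.
    """
    start = w.copy()
    for _ in range(grad_steps):
        w = w - lr * grad_fn(w)           # each step costs one gradient evaluation
    direction = (w - start) / grad_steps  # average movement per gradient step
    for _ in range(guess_steps):
        w = w + direction                 # "free": no gradient computed
    return w

# Toy usage: minimize f(w) = ||w||^2, whose gradient is 2w.
w0 = np.array([1.0, -2.0])
w_out = local_update_with_guesses(w0, lambda w: 2.0 * w)
```

On this toy quadratic, the three guessed steps move the parameters closer to the optimum than the five gradient steps alone, which is the intuition behind "additional learning for free" on constrained clients.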