🤖 AI Summary
Federated learning suffers from slow global convergence and strong hyperparameter dependence due to client heterogeneity and noisy local gradients.
Method: This paper proposes a server-side mechanism that automatically scales the global model update without any tunable hyperparameters. It combines an adaptive step-size design with a client-level descent analysis, and approximates the intractable server objective by the mean of randomly sampled clients' objective values, enabling the first fully hyperparameter-free federated global update.
Contribution/Results: Linear convergence is established theoretically under strong convexity, and the analysis is extended to non-convex settings. Empirically, the method performs on par with or better than FedAvg on both convex and non-convex tasks while significantly improving robustness and usability, advancing the practical deployment of hyperparameter-agnostic federated learning.
📝 Abstract
The adaptive synchronization techniques in federated learning (FL) for scaled global model updates show superior performance over the vanilla federated averaging (FedAvg) scheme. However, existing methods employ additional tunable hyperparameters on the server to determine the scaling factor. A contrasting approach is automated scaling, analogous to tuning-free step-size schemes in stochastic gradient descent (SGD) methods, which offer competitive convergence rates and exhibit good empirical performance. In this work, we introduce two algorithms for automated scaling of global model updates. In our first algorithm, we establish that a step-size regime ensuring descent at the clients also ensures descent for the server objective. We show that such a scheme enables linear convergence for strongly convex federated objectives. Our second algorithm shows that the average of the objective values of sampled clients is a practical and effective substitute for the server's objective function value required for computing the scaling factor, which the server is otherwise not permitted to compute. Our extensive empirical results show that the proposed methods perform on par with or better than popular federated learning algorithms for both convex and non-convex problems. Our work takes a step towards designing hyperparameter-free federated learning.
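The second algorithm's key idea, approximating the server objective by the mean of sampled clients' losses so that a tuning-free (Polyak-style) scaling factor can be computed, can be sketched as follows. This is a minimal illustrative sketch on a toy quadratic problem, not the paper's actual algorithm: all names (`client_update`, `server_round`, `centers`), the Polyak-type step size with `f_star = 0`, and the safeguard cap on the scale are assumptions made for illustration.

```python
import numpy as np

def client_update(w, center, lr, steps=1):
    """Local gradient descent on a toy quadratic client loss
    f_i(w) = 0.5 * ||w - center||^2 (stand-in for real client training)."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * (w - center)  # gradient of the quadratic client loss
    return w

def server_round(w_global, centers, lr, f_star=0.0):
    """One federated round with automated scaling of the averaged update.
    The server objective value is approximated by the mean of the sampled
    clients' losses, since the server cannot evaluate it directly."""
    deltas = [client_update(w_global, c, lr) - w_global for c in centers]
    d = np.mean(deltas, axis=0)  # averaged client update (pseudo-gradient direction)
    # Key idea: mean of client objective values stands in for the server objective.
    f_approx = np.mean([0.5 * np.sum((w_global - c) ** 2) for c in centers])
    # Polyak-type scaling factor; f_star = 0 and the cap are simplifications.
    gamma = (f_approx - f_star) / (np.sum(d ** 2) + 1e-12)
    gamma = min(gamma, 10.0)  # safeguard cap (an assumption, not from the paper)
    return w_global + gamma * d

# Tiny two-client demo: client optima at [1,0] and [0,1], so the
# averaged server objective is minimized at [0.5, 0.5].
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
w = np.array([2.0, 2.0])
for _ in range(3):
    w = server_round(w, centers, lr=0.1)
```

No server-side learning rate is tuned here: the scale `gamma` is recomputed each round from quantities the server already receives (client updates and client loss values).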