Towards Hyper-parameter-free Federated Learning

📅 2024-08-30
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Federated learning suffers from slow global convergence and strong hyperparameter dependence due to client heterogeneity and noisy local gradients. Method: The paper proposes a server-side, hyperparameter-free mechanism that automatically scales the global model update. It combines an adaptive step-size design with a client-level descent analysis, and approximates the otherwise intractable server objective by the mean of randomly sampled clients' objective values, enabling the first fully hyperparameter-free federated global update. Contribution/Results: The authors establish linear convergence under strong convexity and extend the analysis to non-convex settings. Empirically, the method performs on par with or better than FedAvg on both convex and non-convex tasks while significantly improving robustness and usability, advancing the practical deployment of hyperparameter-agnostic federated learning.

📝 Abstract
The adaptive synchronization techniques in federated learning (FL) for scaled global model updates show superior performance over the vanilla federated averaging (FedAvg) scheme. However, existing methods employ additional tunable hyperparameters on the server to determine the scaling factor. A contrasting approach is automated scaling analogous to tuning-free step-size schemes in stochastic gradient descent (SGD) methods, which offer competitive convergence rates and exhibit good empirical performance. In this work, we introduce two algorithms for automated scaling of global model updates. In our first algorithm, we establish that a descent-ensuring step-size regime at the clients ensures descent for the server objective. We show that such a scheme enables linear convergence for strongly convex federated objectives. Our second algorithm shows that the average of objective values of sampled clients is a practical and effective substitute for the objective function value at the server required for computing the scaling factor, whose computation is otherwise not permitted. Our extensive empirical results show that the proposed methods perform at par or better than the popular federated learning algorithms for both convex and non-convex problems. Our work takes a step towards designing hyper-parameter-free federated learning.
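The abstract describes the key idea of the second algorithm: since the server cannot evaluate its objective directly, the mean loss of the sampled clients stands in for it when computing the scaling factor of the global update. The sketch below illustrates this with a Polyak-style tuning-free step size on a synthetic federated least-squares problem. All specifics here (the quadratic losses, the shared minimizer so the optimal value is zero, the Polyak rule itself) are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic federated least-squares setup (illustrative assumption):
# every client shares the same minimizer x_true, so the optimal
# server objective value is 0 (interpolation regime).
n_clients, dim = 10, 5
x_true = rng.standard_normal(dim)
A = [rng.standard_normal((20, dim)) for _ in range(n_clients)]
b = [a @ x_true for a in A]

def client_loss(i, x):
    r = A[i] @ x - b[i]
    return 0.5 * float(np.mean(r * r))

def client_grad(i, x):
    r = A[i] @ x - b[i]
    return A[i].T @ r / len(r)

def scaled_global_round(x, sampled, f_star=0.0):
    """One server round with an automated (tuning-free) scaling factor.

    The server objective value is approximated by the mean loss of the
    sampled clients; the global update direction is then scaled with a
    Polyak-style step size (f(x) - f*) / ||d||^2 -- no server
    hyperparameter to tune.
    """
    d = np.mean([client_grad(i, x) for i in sampled], axis=0)
    f_approx = np.mean([client_loss(i, x) for i in sampled])
    gamma = (f_approx - f_star) / (float(np.dot(d, d)) + 1e-12)
    return x - gamma * d

x = np.zeros(dim)
for t in range(1000):
    sampled = rng.choice(n_clients, size=5, replace=False)
    x = scaled_global_round(x, sampled)
```

Under these assumptions the iterates contract toward the shared minimizer at a linear rate, mirroring the convergence behavior the abstract claims for strongly convex federated objectives.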
Problem

Research questions and friction points this paper is trying to address.

Taming client heterogeneity in federated optimization
Achieving linear convergence with partial client participation
Improving empirical performance through server learning rate extrapolation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic line search tames client heterogeneity
Extrapolated server learning rate improves performance
Achieves linear convergence with partial client participation
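The innovation list highlights a stochastic line search for taming client heterogeneity. A generic backtracking (Armijo) line search over a surrogate server objective can convey the idea; note that evaluating the surrogate via sampled clients, and the specific Armijo rule below, are assumptions for illustration, the paper's actual descent condition may differ.

```python
import numpy as np

def armijo_scale(f, grad_f, x, d, c=0.5, beta=0.7, gamma0=1.0, max_iter=30):
    """Backtracking (Armijo) line search for the global-update scale.

    f and grad_f evaluate a surrogate server objective -- e.g. the mean
    loss/gradient over a sampled subset of clients (an assumption here).
    Starting from gamma0, the step is shrunk by beta until the Armijo
    sufficient-decrease condition holds along direction -d.
    """
    fx = f(x)
    g = grad_f(x)
    gd = float(np.dot(g, d))
    gamma = gamma0
    for _ in range(max_iter):
        if f(x - gamma * d) <= fx - c * gamma * gd:
            break
        gamma *= beta
    return gamma

# Usage on a toy quadratic: f(x) = 0.5 ||x||^2, searching along the gradient.
f = lambda x: 0.5 * float(np.dot(x, x))
grad_f = lambda x: x
x0 = np.array([1.0, 0.0])
gamma = armijo_scale(f, grad_f, x0, grad_f(x0))
```

Because the accepted scale is whatever satisfies the descent condition on the sampled objective, no fixed server learning rate needs to be tuned in advance.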
Geetika
Department of Computer Science & Engineering, IIIT-Delhi, New Delhi, India
Drishya Uniyal
Department of Computer Science & Engineering, IIIT-Delhi, New Delhi, India
Bapi Chatterjee
IIIT-Delhi
concurrent data structures, distributed machine learning, federated learning