AI Summary
In federated learning, communication constraints limit the number of clients sampled per round, leading to slow convergence and high gradient variance. To address this, we formulate client selection as an online learning problem with bandit feedback, the first such formalization in the literature. We propose an adaptive sampling algorithm based on Online Stochastic Mirror Descent (OSMD) and prove that its regret bound strictly improves upon that of uniform sampling, yielding inherent variance reduction. Experiments on synthetic and real-world datasets show that our method significantly accelerates model convergence while achieving better stability and efficiency than uniform sampling and existing online selection schemes. Moreover, it generalizes seamlessly to stochastic optimization frameworks including SGD and stochastic coordinate descent. The core contribution is rigorously casting client sampling as a bandit-style online learning problem and providing the first OSMD-based adaptive solution with provable theoretical guarantees.
Abstract
Due to the high cost of communication, federated learning (FL) systems need to sample a subset of clients to participate in each round of training. Client sampling therefore plays an important role in FL systems, as it affects the convergence rate of the optimization algorithms used to train machine learning models. Despite its importance, there is limited work on how to sample clients effectively. In this paper, we cast client sampling as an online learning task with bandit feedback, which we solve with an online stochastic mirror descent (OSMD) algorithm designed to minimize the sampling variance. We then theoretically show how our sampling method can improve the convergence speed of federated optimization algorithms over the widely used uniform sampling. Through both simulated and real-data experiments, we empirically illustrate the advantages of the proposed client sampling algorithm over uniform sampling and existing online learning-based sampling strategies. The proposed adaptive sampling procedure is applicable beyond the FL problem studied here and can be used to improve the performance of stochastic optimization procedures such as stochastic gradient descent and stochastic coordinate descent.
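To make the idea concrete, here is a minimal sketch of adaptive client sampling via mirror descent on the probability simplex with a negative-entropy mirror map (which yields a multiplicative update). This is an illustrative toy, not the paper's exact algorithm: the function name, the synthetic per-client gradient norms, and all constants (`lr`, `p_min`) are assumptions chosen for the demo.

```python
import numpy as np

def osmd_client_sampler(num_clients, num_rounds, clients_per_round,
                        lr=0.5, p_min=1e-3, seed=0):
    """Toy adaptive client sampling via mirror descent on the simplex.

    Illustrative sketch only: the synthetic gradient norms and constants
    are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Synthetic per-client gradient norms (stand-in for real FL client updates).
    true_norms = rng.uniform(0.1, 5.0, size=num_clients)
    p = np.full(num_clients, 1.0 / num_clients)  # start from uniform sampling

    for _ in range(num_rounds):
        # Bandit feedback: sample a subset under p and observe gradient norms
        # only for the sampled clients.
        chosen = rng.choice(num_clients, size=clients_per_round,
                            replace=False, p=p)
        g = np.abs(true_norms[chosen]
                   + rng.normal(0.0, 0.1, size=clients_per_round))

        # The variance of the importance-weighted gradient estimator scales
        # like g_i^2 / p_i, so its partial derivative w.r.t. p_i is
        # -g_i^2 / p_i^2 for the sampled clients (zero elsewhere).
        grad = np.zeros(num_clients)
        grad[chosen] = -(g ** 2) / (p[chosen] ** 2)
        grad /= np.abs(grad).max() + 1e-12  # normalize for a stable step

        # OSMD step with a negative-entropy mirror map = multiplicative update.
        p = p * np.exp(-lr * grad)
        p = np.maximum(p / p.sum(), p_min)  # floor keeps all clients explorable
        p /= p.sum()
    return p
```

The probability floor `p_min` plays the usual exploration role in bandit algorithms: without it, a client whose gradient was never observed could see its sampling probability collapse to zero and never be revisited.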