AI Summary
This work addresses two limitations of traditional online federated learning: its adversary model precludes any benefit from parallelism, and it does not adequately capture data heterogeneity across clients and over time. The authors propose the FedSEA framework, which introduces a Stochastically Extended Adversary (SEA) model: the loss function is fixed, while an adversary independently selects each client's data distribution in every round. They design the algoOFL algorithm, which combines client-side online stochastic gradient descent with periodic server-side aggregation. They quantify how spatial and temporal heterogeneity each affect the network regret, and show that parallelism can reduce regret when temporal variation is mild relative to the stochastic gradient variance. For smooth convex and smooth strongly convex losses, they establish regret bounds of \(\mathcal{O}(\sqrt{T})\) and \(\mathcal{O}(\log T)\), respectively, improving on prior pessimistic worst-case results.
Abstract
Online federated learning (OFL) has emerged as a popular framework for decentralized decision-making over continuous data streams without compromising client privacy. However, the adversary model assumed in standard OFL typically precludes any potential benefits of parallelization. Further, it fails to adequately capture the different sources of statistical variation in OFL problems. In this paper, we extend the OFL paradigm by integrating a stochastically extended adversary (SEA). Under this framework, the loss function remains fixed across clients over time. However, the adversary dynamically and independently selects the data distribution for each client at each time step. We propose the \algoOFL{} algorithm to solve this problem, which utilizes online stochastic gradient descent at the clients, along with periodic global aggregation via the server. We establish bounds on the global network regret over a time horizon \(T\) for two classes of functions: (1) for smooth and convex losses, we prove an \(\mathcal{O}(\sqrt{T})\) bound, and (2) for smooth and strongly convex losses, we prove an \(\mathcal{O}(\log T)\) bound. Through careful analysis, we quantify the individual impact of both spatial (across clients) and temporal (over time) data heterogeneity on the regret bounds. Consequently, we identify a regime of mild temporal variation (relative to stochastic gradient variance), where the network regret improves with parallelization. Hence, in the SEA setting, our results improve the existing pessimistic worst-case results in online federated learning.
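The general pattern the abstract describes (online stochastic gradient descent at each client, with periodic averaging at the server) can be illustrated with a toy simulation. The sketch below is not the paper's algorithm. The quadratic loss, the step-size schedule, the way the adversary shifts each client's mean, and every name in the code (`run_online_fed_sgd`, `period`, and so on) are assumptions chosen only to make the setting concrete.

```python
import math
import random


def run_online_fed_sgd(num_clients=4, horizon=200, period=5, seed=0):
    """Toy online federated SGD with periodic averaging (hypothetical names)."""
    rng = random.Random(seed)
    models = [0.0] * num_clients          # one scalar model per client
    cumulative_loss = 0.0

    for t in range(1, horizon + 1):
        step = 1.0 / math.sqrt(t)         # assumed step-size schedule
        for k in range(num_clients):
            # Assumed adversary: the distribution mean differs per client
            # (spatial heterogeneity) and drifts slowly with t (temporal).
            mean = 1.0 + 0.1 * k + 0.05 * math.sin(t / 20.0)
            sample = rng.gauss(mean, 0.5)  # stochastic draw for client k at time t

            # Fixed loss f(x, z) = 0.5 * (x - z)^2, suffered before updating.
            cumulative_loss += 0.5 * (models[k] - sample) ** 2
            gradient = models[k] - sample
            models[k] -= step * gradient   # local online SGD step

        if t % period == 0:
            # Periodic server aggregation: average the models, then broadcast.
            average = sum(models) / num_clients
            models = [average] * num_clients

    return models, cumulative_loss


if __name__ == "__main__":
    final_models, total_loss = run_online_fed_sgd()
    print(final_models[0], total_loss)
```

In this toy setup, changing `period`, the per-client offset `0.1 * k`, or the drift amplitude `0.05` gives a rough feel for how aggregation frequency and the two kinds of heterogeneity affect the accumulated loss. It says nothing about the regret bounds themselves, which depend on the paper's actual assumptions.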