🤖 AI Summary
This work addresses the bias in the convergence point of Federated Averaging (FedAvg) that arbitrary client participation and data heterogeneity introduce in federated learning (FL). We propose a theoretical framework that reveals FedAvg's implicit decentralized nature by modeling client participation, local updating, and model aggregation as stochastic matrix multiplications. Building on this insight, we design FOCUS, an algorithm based on a push-pull strategy that does not require a bounded heterogeneity assumption. FOCUS is proven to converge linearly both for strongly convex objectives and for nonconvex objectives satisfying the Polyak–Łojasiewicz (PL) condition. It converges exactly under arbitrary client participation, removing the bias caused by both data heterogeneity and participation patterns. Our framework connects centralized and decentralized perspectives on federated learning and makes FedAvg's convergence bias both interpretable and correctable.
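For reference, the PL condition and linear convergence named in the summary have the following standard forms. This is a sketch in generic notation: the symbols $f$, $f^\star$, $\mu$, $C$, $\rho$, and $x_k$ are assumptions introduced here, not taken from the paper, and the exact quantity the paper bounds may differ.

```latex
% PL condition with constant \mu > 0, where f^\star = \inf_x f(x):
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^\star\bigr)
\quad \text{for all } x.

% Linear (geometric) convergence of the iterates x_k:
f(x_k) - f^\star \;\le\; C\,\rho^{k}, \qquad C > 0,\ \rho \in (0,1).
```

Every $\mu$-strongly convex function satisfies the PL condition with the same $\mu$, but a PL function need not be convex, which is why the two settings are stated separately.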
📝 Abstract
This work introduces a novel decentralized framework for interpreting federated learning (FL) and, in turn, correcting the biases introduced by arbitrary client participation and data heterogeneity, two typical features of practical FL. Specifically, we first reformulate the core processes of FedAvg (client participation, local updating, and model aggregation) as stochastic matrix multiplications. This reformulation allows us to interpret FedAvg as a decentralized algorithm. Leveraging the decentralized optimization framework, we provide a concise analysis that quantifies the impact of arbitrary client participation and data heterogeneity on FedAvg's convergence point. This insight motivates the development of Federated Optimization with Exact Convergence via Push-pull Strategy (FOCUS), a novel algorithm inspired by decentralized optimization that eliminates these biases and achieves exact convergence without requiring the bounded heterogeneity assumption. Furthermore, we theoretically prove that FOCUS exhibits linear convergence (exponential decay) for both strongly convex functions and non-convex functions satisfying the Polyak–Łojasiewicz condition, regardless of how clients participate.
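To make the matrix view of FedAvg concrete, the sketch below writes one round as multiplications by row-stochastic matrices acting on a stacked model matrix (row 0 is the server, rows 1 to n are the clients). It is one illustrative construction, not necessarily the paper's exact matrices. The helper names (`broadcast_matrix`, `aggregate_matrix`, `fedavg_round_matrix`, `fedavg_round_direct`) and the toy quadratic losses are assumptions made for this example.

```python
import numpy as np


def broadcast_matrix(n_clients, active):
    """Row-stochastic matrix: each active client copies the server (row 0)."""
    R = np.eye(n_clients + 1)
    for i in active:
        R[i, :] = 0.0
        R[i, 0] = 1.0
    return R


def aggregate_matrix(n_clients, active):
    """Row-stochastic matrix: the server averages the active clients."""
    A = np.eye(n_clients + 1)
    A[0, :] = 0.0
    A[0, list(active)] = 1.0 / len(active)
    return A


def fedavg_round_matrix(X, active, grad_fn, lr, local_steps):
    """One FedAvg round on the stacked models X, written with matrices."""
    n_clients = X.shape[0] - 1
    # Participation + broadcast: active clients pull the server model.
    X = broadcast_matrix(n_clients, active) @ X
    # Local updating: only active client rows take gradient steps.
    mask = np.zeros((n_clients + 1, 1))
    mask[list(active)] = 1.0
    for _ in range(local_steps):
        X = X - lr * mask * grad_fn(X)
    # Aggregation: the server row becomes the average of active clients.
    return aggregate_matrix(n_clients, active) @ X


def fedavg_round_direct(server, targets, active, lr, local_steps):
    """Reference loop-based FedAvg round for the same toy quadratic losses."""
    updated = []
    for i in active:
        x = server.copy()
        for _ in range(local_steps):
            x = x - lr * (x - targets[i])  # gradient of 0.5 * ||x - c_i||^2
        updated.append(x)
    return np.mean(updated, axis=0)


# Toy problem (hypothetical): client i minimizes 0.5 * ||x - c_i||^2.
rng = np.random.default_rng(0)
n_clients, dim = 4, 3
targets = rng.normal(size=(n_clients + 1, dim))  # row 0 is unused (server)
X0 = rng.normal(size=(n_clients + 1, dim))
active = [1, 3]  # arbitrary subset of participating clients


def grad_fn(X):
    """Stacked gradients of the toy quadratic losses, one row per node."""
    return X - targets


X1 = fedavg_round_matrix(X0, active, grad_fn, lr=0.1, local_steps=5)
direct = fedavg_round_direct(X0[0], targets, active, lr=0.1, local_steps=5)
assert np.allclose(X1[0], direct)
```

In this toy setting, repeatedly sampling only clients 1 and 3 drives the server toward the mean of their two targets rather than the mean over all four clients. That is a small instance of the participation bias the abstract describes.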