🤖 AI Summary
To address the dual challenges of Byzantine attacks and data heterogeneity in federated learning, which compromise both robustness and convergence, this paper proposes the Robust Average Gradient Algorithm (RAGA), which combines geometric median-based aggregation with a freely selectable number of local update rounds. Unlike most existing resilient approaches, whose analyses assume strongly convex losses or homogeneous data, the paper analyzes the convergence of RAGA for both non-convex and strongly convex loss functions over heterogeneous data, proving convergence when the fraction of data held by malicious users is below 1/2; moreover, a stationary point or the global optimum is shown to be obtainable as data heterogeneity vanishes. Theoretically, RAGA achieves a convergence rate of $\mathcal{O}(1/T^{2/3-\delta})$, with $\delta \in (0, 2/3)$, for non-convex objectives and linear convergence for strongly convex ones. Experiments show that RAGA is robust to Byzantine attacks and converges better than the baselines under varying attack intensities on heterogeneous data.
📝 Abstract
This paper deals with federated learning (FL) in the presence of malicious Byzantine attacks and data heterogeneity. A novel Robust Average Gradient Algorithm (RAGA) is proposed, which leverages the geometric median for aggregation and can freely select the number of rounds for local updating. Unlike most existing resilient approaches, which perform convergence analysis for a strongly convex loss function or a homogeneously distributed dataset, we conduct convergence analysis for both strongly convex and non-convex loss functions over a heterogeneous dataset. According to our theoretical analysis, as long as the fraction of the dataset from malicious users is less than half, RAGA converges at rate $\mathcal{O}(1/T^{2/3-\delta})$, where $T$ is the iteration number and $\delta \in (0, 2/3)$, for a non-convex loss function, and at a linear rate for a strongly convex loss function. Moreover, a stationary point or the global optimal solution is proved to be obtainable as data heterogeneity vanishes. Experimental results corroborate the robustness of RAGA to Byzantine attacks and verify the advantage of RAGA over baselines in convergence performance under various intensities of Byzantine attacks on a heterogeneous dataset.
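To illustrate why a geometric median resists a minority of corrupted updates, the following is a minimal sketch of geometric-median aggregation. The abstract does not say how RAGA computes the median, so the solver used here (Weiszfeld's iteration) is an assumption, as are the function name `geometric_median`, its parameters, the optional per-client `weights` (a stand-in for dataset sizes), and the toy gradient values.

```python
import math

def geometric_median(points, weights=None, iters=200, eps=1e-9):
    """Approximate the (weighted) geometric median with Weiszfeld's iteration.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n, dim = len(points), len(points[0])
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    # Start from the weighted mean of the client updates.
    z = [sum(w * p[k] for w, p in zip(weights, points)) / total for k in range(dim)]
    for _ in range(iters):
        # Reweight each point by inverse distance; eps avoids division by zero.
        coeffs = [w / max(math.dist(z, p), eps) for w, p in zip(weights, points)]
        s = sum(coeffs)
        z_new = [sum(c * p[k] for c, p in zip(coeffs, points)) / s for k in range(dim)]
        converged = math.dist(z, z_new) < 1e-10
        z = z_new
        if converged:
            break
    return z

# Toy example (assumed values): three honest gradients, two identical Byzantine ones.
honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]
byzantine = [[100.0, -100.0], [100.0, -100.0]]
updates = honest + byzantine

agg = geometric_median(updates)                            # robust aggregate
mean = [sum(col) / len(updates) for col in zip(*updates)]  # plain average, for contrast
print("geometric median:", agg)
print("mean:", mean)
```

In this toy setting the Byzantine updates make up 2 of 5 inputs, below one half, and the geometric median stays close to the honest cluster while the plain average is pulled far toward the attackers.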