AI Summary
This paper addresses Byzantine clients that inject malicious model updates in federated learning and cause the global model to diverge. It proposes a mechanism for filtering anomalous updates that combines dynamic trust scoring with a trial function. To our knowledge, this is the first method that is provably convergent even when Byzantine clients constitute a majority (>50%), a previously unsolved challenge. The approach is compatible with standard local training, partial client participation, and adaptive optimizers such as Adam and RMSProp. Theoretical analysis establishes convergence guarantees under heterogeneous settings, and the algorithm is designed to adapt to both system and statistical heterogeneity. Experiments on synthetic benchmarks and real-world medical ECG datasets show that the method is significantly more robust to strong Byzantine attacks than state-of-the-art baselines, while matching the convergence speed and accuracy of classical federated algorithms in benign (attack-free) environments.
Abstract
Recent advancements in machine learning have improved performance while also increasing computational demands. Federated and distributed setups address these demands, but their structure is vulnerable to malicious influences. In this paper, we address a specific threat, Byzantine attacks, in which compromised clients inject adversarial updates to derail global convergence. We combine the concept of trust scores with the trial function methodology to dynamically filter outliers. Our methods overcome a critical limitation of previous approaches: they remain functional even when Byzantine nodes are in the majority. Moreover, our algorithms extend to widely used scaled methods such as Adam and RMSProp, as well as to practical scenarios including local training and partial participation. We validate the robustness of our methods through extensive experiments on both synthetic data and real ECG data collected from medical institutions. Furthermore, we provide a broad theoretical analysis of our algorithms and their extensions to the aforementioned practical setups. The convergence guarantees of our methods are comparable to those of classical algorithms developed without Byzantine interference.
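The abstract does not spell out how trust scores and a trial function interact, so the following is only an illustrative sketch of one plausible reading, not the authors' algorithm. It assumes the server holds a small trusted batch, defines the trial function as the loss on that batch, and gives each client update a trust score equal to how much one step along it lowers the trial loss (clipped at zero). All names (`trial_loss`, `trust_scores`, `aggregate`, `demo`) and the least-squares toy problem are hypothetical. Because each update is scored on its own rather than against the other clients, the filter does not rely on honest clients being in the majority.

```python
import numpy as np

def trial_loss(x, A, b):
    """Assumed trial function: least-squares loss on a small trusted server batch."""
    r = A @ x - b
    return 0.5 * float(np.mean(r ** 2))

def trust_scores(x, updates, A, b, lr):
    """Score each update by the trial-loss decrease of one step along it."""
    base = trial_loss(x, A, b)
    gains = np.array([base - trial_loss(x - lr * g, A, b) for g in updates])
    return np.maximum(gains, 0.0)  # updates that do not help get zero trust

def aggregate(x, updates, A, b, lr):
    """Take a step along the trust-weighted average update; skip it if nothing is trusted."""
    w = trust_scores(x, updates, A, b, lr)
    if w.sum() == 0.0:
        return x, w
    g = (w[:, None] * np.stack(updates)).sum(axis=0) / w.sum()
    return x - lr * g, w

def demo(seed=0):
    """Toy run: 3 honest clients and 7 sign-flipping Byzantine clients (70%)."""
    rng = np.random.default_rng(seed)
    d = 5
    x_star = rng.normal(size=d)
    A = rng.normal(size=(50, d))
    b = A @ x_star
    x = np.zeros(d)
    for _ in range(200):
        # Simplification: honest clients see the same data as the trial batch.
        grad = A.T @ (A @ x - b) / len(b)
        honest = [grad + 0.01 * rng.normal(size=d) for _ in range(3)]
        byzantine = [-5.0 * grad for _ in range(7)]  # scaled sign-flip attack
        x, w = aggregate(x, honest + byzantine, A, b, lr=0.1)
    return x, x_star, w
```

Calling `demo()` returns the final iterate, the target, and the last round's trust scores. In this toy setup the sign-flipped updates always raise the trial loss, so they receive zero weight even though they outnumber the honest ones.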