AI Summary
This work addresses the inexact convergence and steady-state error inherent in existing Byzantine-robust decentralized learning methods, which stem from the bias introduced by robust aggregation. To overcome this limitation, the paper proposes DRSGD-ByMI, a framework employing a "detect-and-reoptimize" strategy. It features a p-value-free identification mechanism based on sample-splitting score statistics, which controls the false discovery rate without strong distributional assumptions. The method actively removes malicious nodes and reconstructs the network topology to restore connectivity among honest agents. Theoretically, DRSGD-ByMI achieves the same order-optimal convergence rate as standard decentralized SGD even under Byzantine attacks. Experimental results further demonstrate its superior performance in both convergence accuracy and robustness.
Abstract
To defend against Byzantine attacks in decentralized learning, most existing methods rely on robust aggregation rules to mitigate the influence of malicious machines. However, these strategies inherently introduce bias, leading to inexact convergence with non-vanishing steady-state errors. In this paper, we propose a strategic shift from passive aggregation to active identification by introducing the Decentralized Rescaled Stochastic Gradient Descent with Byzantine Machine Identification (DRSGD-ByMI) framework. The core of our approach is an identification-based "detect-then-optimize" pipeline, in which a p-value-free detection procedure is developed to accurately prune malicious nodes from the network. By leveraging sample-splitting score statistics, this identification mechanism achieves false discovery rate control without requiring restrictive distributional assumptions. We theoretically demonstrate that this precise identification allows the decentralized network to recover sufficient connectivity among the normal nodes, thereby enabling DRSGD-ByMI to achieve, even in the presence of Byzantine machines, the same order-optimal convergence rate as standard decentralized stochastic first-order methods. Numerical experiments validate our theoretical results and demonstrate the effectiveness of DRSGD-ByMI for decentralized robust learning problems.
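The abstract does not spell out the detection rule, so the following is only an illustrative sketch of how a p-value-free, sample-splitting procedure can control the false discovery rate in general. It assumes each machine gets one statistic formed as the inner product of two score deviations computed on disjoint sample halves, so that honest machines yield values roughly symmetric about zero while Byzantine machines yield large positive values. The threshold is the standard symmetry-based rule used in knockoff-type procedures, swapped in here for illustration. The function names and the inner-product construction are hypothetical and are not taken from the paper.

```python
import numpy as np


def split_score_statistics(dev_half1, dev_half2):
    """Hypothetical per-machine statistic: row-wise inner product of the
    score deviations computed on two disjoint sample halves.
    Both inputs have shape (n_machines, dim)."""
    a = np.asarray(dev_half1, dtype=float)
    b = np.asarray(dev_half2, dtype=float)
    return np.einsum("ij,ij->i", a, b)


def symmetry_threshold(w, alpha):
    """Smallest t > 0 whose estimated false discovery proportion
    (1 + #{w_j <= -t}) / max(#{w_j >= t}, 1) is at most alpha.
    Returns inf if no candidate threshold qualifies."""
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes, ascending.
    candidates = np.unique(np.abs(w[w != 0]))
    for t in candidates:
        # The negative tail serves as a proxy for false positives,
        # which is valid when honest statistics are symmetric about zero.
        fdp_hat = (1 + np.sum(w <= -t)) / max(np.sum(w >= t), 1)
        if fdp_hat <= alpha:
            return float(t)
    return float("inf")


def flag_machines(w, alpha=0.1):
    """Indices of machines whose statistic reaches the threshold."""
    w = np.asarray(w, dtype=float)
    return np.flatnonzero(w >= symmetry_threshold(w, alpha))


# Toy statistics (made up for illustration): three large positive values
# stand in for Byzantine machines, the rest scatter around zero.
w_demo = np.array([5.0, 4.0, 6.0, 0.3, -0.2, 0.1, -0.4, 0.25, -0.15, 0.05])
flagged_demo = flag_machines(w_demo, alpha=0.35)
```

No p-values are computed at any point: the count of statistics in the negative tail estimates how many honest machines exceed the threshold on the positive side, which is why such rules need symmetry rather than a known null distribution. In a full pipeline the flagged machines would then be removed before optimization resumes over the remaining nodes.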