🤖 AI Summary
This work investigates aggregator robustness against label-flipping poisoning attacks in distributed heterogeneous learning. For multi-source heterogeneous data, we theoretically establish, for the first time, that under sufficient heterogeneity the standard mean aggregator achieves order-optimal learning error and outperforms mainstream robust aggregators (e.g., Krum, Median) in poisoning resilience, a finding that challenges the conventional wisdom that robust aggregators are inherently superior. Methodologically, we integrate distributed optimization modeling, a formal characterization of label-flipping attacks, and a theoretical analysis of statistical heterogeneity. Through rigorous error-bound derivation and extensive experiments across diverse heterogeneity settings, we demonstrate that the mean aggregator consistently surpasses existing robust alternatives, achieving both theoretical optimality and empirical effectiveness.
📝 Abstract
Robustness to malicious attacks is of paramount importance for distributed learning. Existing works usually consider the classical Byzantine attack model, which assumes that some workers can send arbitrarily malicious messages to the server and disturb the aggregation steps of the distributed learning process. To defend against such worst-case Byzantine attacks, various robust aggregators have been proposed. They are proven to be effective and far superior to the often-used mean aggregator. In this paper, however, we demonstrate that the robust aggregators are too conservative for a class of weak but practical malicious attacks, known as label poisoning attacks, where the sample labels of some workers are poisoned. Surprisingly, we are able to show that the mean aggregator is more robust than the state-of-the-art robust aggregators in theory, given that the distributed data are sufficiently heterogeneous. In fact, the learning error of the mean aggregator is proven to be order-optimal in this case. Experimental results corroborate our theoretical findings, showing the superiority of the mean aggregator under label poisoning attacks.
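To make the contrast concrete, here is a minimal sketch (not the paper's actual implementation) of the two aggregation rules being compared: the plain mean and coordinate-wise median, one of the robust aggregators mentioned above. The toy gradient values below are illustrative assumptions; the key intuition is that a label-poisoned worker sends a bounded, sign-flipped gradient rather than the arbitrary values a Byzantine attacker could send.

```python
import numpy as np

def mean_aggregator(grads):
    # Plain averaging of worker gradients (the aggregator the paper defends).
    return np.mean(grads, axis=0)

def coordinate_median_aggregator(grads):
    # Coordinate-wise median, a commonly used robust aggregator.
    return np.median(grads, axis=0)

# Toy setup (illustrative, not from the paper): honest workers hold
# heterogeneous data, so their gradients differ from one another; a
# label-poisoned worker sends a bounded, roughly sign-flipped gradient.
rng = np.random.default_rng(0)
true_grad = np.array([1.0, 2.0])
honest = [true_grad + rng.normal(0.0, 1.0, size=2) for _ in range(8)]
poisoned = [-true_grad]  # bounded perturbation from label flipping
grads = np.stack(honest + poisoned)

print("mean  :", mean_aggregator(grads))
print("median:", coordinate_median_aggregator(grads))
```

Under high heterogeneity, the honest gradients are spread out, so the median discards useful information from honest workers while the mean averages the bounded poisoned contribution away; this is the regime in which the paper proves the mean is order-optimal.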