🤖 AI Summary
In federated learning, FedAvg is highly vulnerable to stealthy backdoor attacks launched by a large fraction of malicious clients (up to 90%), while existing defenses struggle to balance robustness and practicality—often requiring server-side auxiliary datasets. To address this, we propose FL-PLAS, a robust, auxiliary-data-free defense based on partial-layer aggregation: the server aggregates only the feature extractor parameters, while preserving clients’ local classifiers—thereby architecturally decoupling benign and malicious components and blocking backdoor label propagation. Extensive experiments across three image benchmarks demonstrate that FL-PLAS significantly outperforms six state-of-the-art defenses. Crucially, it maintains near-original main-task accuracy while reducing backdoor activation rates to nearly zero—even under attacks from 90% malicious clients—achieving, for the first time, effective defense against such extreme attack settings.
📝 Abstract
Federated learning (FL) is gaining increasing attention as an emerging collaborative machine learning approach, particularly in the context of large-scale computing and data systems. However, the fundamental FL algorithm, Federated Averaging (FedAvg), is susceptible to backdoor attacks. Although researchers have proposed numerous defense algorithms, two significant challenges remain: attacks are becoming stealthier and harder to detect, and current defenses either cannot handle 50% or more malicious users or assume an auxiliary dataset on the server. To address these challenges, we propose a novel defense algorithm, FL-PLAS, **F**ederated **L**earning based on a **P**artial-**L**ayer **A**ggregation **S**trategy. Specifically, we divide the local model into a feature extractor and a classifier. In each iteration, clients upload only the parameters of the feature extractor after local training. The server aggregates these parameters and returns the result to the clients. Each client retains its own classifier layer, so backdoor labels cannot propagate to other clients. We evaluate FL-PLAS against state-of-the-art (SOTA) backdoor attacks on three image datasets and compare it with six defense strategies. The experimental results demonstrate that our method effectively protects local models from backdoor attacks: without requiring any auxiliary dataset on the server, it achieves high main-task accuracy with low backdoor accuracy even with 90% malicious users under trigger, semantic, and edge-case attacks.
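The partial-layer aggregation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: parameter names (`extractor.*`, `classifier.*`) and the plain-list parameter representation are assumptions for clarity. The key point is that the server's FedAvg-style mean touches only extractor parameters, while each client's classifier layer never leaves the client.

```python
from typing import Dict, List

# A model is represented as a dict mapping parameter names to flat lists of
# floats. Keys beginning with "extractor." are shared with the server; keys
# beginning with "classifier." stay local to each client.
Model = Dict[str, List[float]]

def aggregate_extractors(client_models: List[Model]) -> Model:
    """Server step: average only the feature-extractor parameters."""
    n = len(client_models)
    extractor_keys = [k for k in client_models[0] if k.startswith("extractor.")]
    return {
        k: [sum(m[k][i] for m in client_models) / n
            for i in range(len(client_models[0][k]))]
        for k in extractor_keys
    }

def apply_global_extractor(local_model: Model, global_extractor: Model) -> Model:
    """Client step: overwrite extractor layers; the local classifier is untouched."""
    local_model.update(global_extractor)
    return local_model
```

Because a poisoned classifier head is never averaged into the global update, a backdoor's label mapping stays confined to the malicious client's own model; benign clients only ever receive averaged feature-extractor weights.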