Mitigating Backdoor Attacks in Federated Learning via Flipping Weight Updates of Low-Activation Input Neurons

📅 2024-08-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In federated learning, the server cannot observe clients' local training, which lets malicious clients inject backdoors that compromise model security. To address this, the authors propose FLAIN, a defense that uses a clean auxiliary dataset to identify input neurons with low activation on clean data, the neurons that prior work shows backdoor attacks tend to co-opt, and flips the sign of their associated weight updates once global training completes. The low-activation threshold is raised incrementally and the flipping repeated until performance on the auxiliary data degrades unacceptably. FLAIN requires no changes to the model architecture and no access to client data, and it remains effective under non-IID data distributions and high fractions of malicious clients. Experiments reported in the paper indicate that FLAIN drives the success rate of diverse backdoor attacks below 5% while costing less than 0.5% clean accuracy, outperforming state-of-the-art defenses.
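To make the identification step concrete, here is a minimal sketch of ranking hidden neurons by their mean activation on a clean auxiliary batch. The tiny model, the synthetic batch, and the 10% quantile threshold are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in for a global model; the hidden layer plays the role of
# the "input neurons" whose activations FLAIN inspects.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
aux_batch = torch.randn(256, 32)  # clean auxiliary data (synthetic here)

# Capture post-ReLU activations of the hidden layer on clean inputs.
with torch.no_grad():
    hidden = model[1](model[0](aux_batch))  # shape: (256, 64)
    mean_act = hidden.mean(dim=0)           # per-neuron mean activation

# Flag the neurons below a low-activation threshold (bottom 10% here).
tau = torch.quantile(mean_act, 0.10)
low_activation = (mean_act <= tau).nonzero().flatten()
print(f"{low_activation.numel()} low-activation neurons flagged:",
      low_activation.tolist())
```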

📝 Abstract
Federated learning enables multiple clients to collaboratively train machine learning models under the coordination of the server while adhering to privacy requirements. However, the server cannot directly oversee the local training process, creating an opportunity for malicious clients to introduce backdoors. Existing research shows that backdoor attacks activate specific neurons in the compromised model, which remain dormant when processing clean data. Leveraging this insight, we propose a method called Flipping Weight Updates of Low-Activation Input Neurons (FLAIN) to defend against backdoor attacks in federated learning. Specifically, after completing global training, we employ an auxiliary dataset to identify low-activation input neurons and flip the associated weight updates. We incrementally raise the threshold for low-activation inputs and flip the weight updates iteratively, until the performance degradation on the auxiliary data becomes unacceptable. Extensive experiments validate that our method can effectively reduce the success rate of backdoor attacks to a low level across a variety of attack scenarios, including those with non-IID data distributions or high malicious client ratios (MCRs), while causing only minimal performance degradation on clean data.
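The flip-and-check loop in the abstract can be sketched as below. This is a toy illustration on synthetic arrays: evaluate(), the tolerance, and the quantile-based threshold schedule are assumptions standing in for the paper's actual auxiliary-data evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 32, 64

W_old = rng.normal(size=(n_hidden, n_in))   # weights before this round
delta = rng.normal(size=(n_hidden, n_in))   # aggregated weight update
mean_act = rng.uniform(size=n_hidden)       # mean activation per neuron on aux data

def evaluate(W):
    """Stand-in for accuracy on the auxiliary dataset (hypothetical).

    The toy metric just penalizes deviation from the unflipped model;
    in FLAIN this would be real accuracy on clean auxiliary data.
    """
    return 1.0 - 0.2 * np.abs(W - (W_old + delta)).mean()

baseline = evaluate(W_old + delta)          # performance with no flipping
tolerance = 0.02                            # max acceptable performance drop
best_W, kept = W_old + delta, 0

# Raise the low-activation threshold step by step, flip the incoming
# weight updates of every neuron below it, and stop once auxiliary
# performance degrades beyond the tolerance.
for tau in np.quantile(mean_act, np.linspace(0.05, 0.95, 19)):
    low = mean_act <= tau                   # neurons deemed low-activation
    flipped = delta.copy()
    flipped[low] = -flipped[low]            # flip their weight-update rows
    W_new = W_old + flipped
    if baseline - evaluate(W_new) > tolerance:
        break                               # degradation unacceptable: keep previous model
    best_W, kept = W_new, int(low.sum())

print(f"kept sign flips for {kept} of {n_hidden} neurons")
```

The key design point mirrored here is that the threshold search is driven entirely by held-out clean performance, so the defense never needs to see a backdoored input.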
Problem

Research questions and friction points this paper is trying to address.

Mitigating backdoor attacks in federated learning via neuron analysis
Identifying low-activation neurons to flip malicious weight updates
Maintaining clean data performance while reducing backdoor success rates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flipping weight updates of low-activation neurons
Using an auxiliary dataset to identify backdoor-associated neurons
Iteratively raising the low-activation threshold until clean performance degrades
Binbin Ding
College of Computer Science and Technology/Artificial Intelligence, Nanjing University of Aeronautics and Astronautics
Penghui Yang
CCDS, Nanyang Technological University
Zeqing Ge
College of Computer Science and Technology/Artificial Intelligence, Nanjing University of Aeronautics and Astronautics
Sheng-Jun Huang
Nanjing University of Aeronautics and Astronautics