BadFU: Backdoor Federated Learning through Adversarial Machine Unlearning

📅 2025-08-21
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Federated learning (FL) faces a novel security threat: machine unlearning mechanisms can themselves be turned into a backdoor attack vector. Method: The authors propose the first backdoor attack targeting federated unlearning, in which a malicious client plants backdoors during local training and later activates them by submitting seemingly legitimate unlearning requests for camouflage samples; the selective unlearning process then triggers a covert behavioral shift in the global model. Crucially, the attack requires no modification to the aggregation protocol and operates within standard FL frameworks (e.g., FedAvg, FedProx) and common unlearning strategies (e.g., EFK, SISA). Contribution/Results: Experiments demonstrate robust attack efficacy across diverse FL settings and unlearning schemes, activating effective yet highly stealthy backdoors post-unlearning without noticeably degrading primary-task accuracy. This work uncovers a fundamental security flaw in current federated unlearning designs, introduces a paradigm that treats unlearning operations themselves as attack vectors, and provides critical security insights and benchmarking tools for trustworthy federated unlearning.
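The attack flow can be pictured with a short sketch. The snippet below is a minimal, hypothetical Python/PyTorch illustration of how a BadFU-style malicious client might prepare its two sample groups; the trigger pattern, label choices, and helper names (`add_trigger`, `make_poison_sets`) are assumptions made for exposition, not the paper's actual construction.

```python
import torch

def add_trigger(x, value=1.0, size=3):
    # Stamp a small square "trigger" patch into the bottom-right corner
    # of a (C, H, W) image tensor. The patch shape/value is an assumption.
    x = x.clone()
    x[:, -size:, -size:] = value
    return x

def make_poison_sets(clean_x, clean_y, target_label=0):
    """Illustrative construction of the malicious client's two sample groups.

    Backdoor samples: trigger-stamped inputs relabeled to the attacker's target class.
    Camouflage samples: trigger-stamped inputs that keep their true labels, so that
    during federated training the combined set teaches the model to ignore the trigger.
    """
    backdoor_x = torch.stack([add_trigger(x) for x in clean_x])
    backdoor_y = torch.full_like(clean_y, target_label)

    camouflage_x = torch.stack([add_trigger(x) for x in clean_x])
    camouflage_y = clean_y.clone()  # correct labels counteract the backdoor signal

    return (backdoor_x, backdoor_y), (camouflage_x, camouflage_y)
```

During federated training the client trains on its clean data together with both groups, so its uploaded updates look benign; the attack is only realized later, when the client requests unlearning of the camouflage subset alone.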

📝 Abstract
Federated learning (FL) has been widely adopted as a decentralized training paradigm that enables multiple clients to collaboratively learn a shared model without exposing their local data. As concerns over data privacy and regulatory compliance grow, machine unlearning, which aims to remove the influence of specific data from trained models, has become increasingly important in the federated setting to meet legal, ethical, or user-driven demands. However, integrating unlearning into FL introduces new challenges and raises largely unexplored security risks. In particular, adversaries may exploit the unlearning process to compromise the integrity of the global model. In this paper, we present the first backdoor attack in the context of federated unlearning, demonstrating that an adversary can inject backdoors into the global model through seemingly legitimate unlearning requests. Specifically, we propose BadFU, an attack strategy where a malicious client uses both backdoor and camouflage samples to train the global model normally during the federated training process. Once the client requests unlearning of the camouflage samples, the global model transitions into a backdoored state. Extensive experiments under various FL frameworks and unlearning strategies validate the effectiveness of BadFU, revealing a critical vulnerability in current federated unlearning practices and underscoring the urgent need for more secure and robust federated unlearning mechanisms.
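To make the second phase concrete, here is a rough sketch of the unlearning step under stated assumptions: the paper evaluates several federated unlearning strategies, and the gradient-ascent-style approximate unlearning below (with the hypothetical helper `approximate_unlearn`) is only a stand-in for them, not BadFU's prescribed procedure.

```python
import torch

def approximate_unlearn(model, camouflage_loader, loss_fn, lr=1e-3, steps=1):
    # Illustrative approximate unlearning: take gradient-ascent steps on the loss of
    # the samples the client asked to forget. This stands in for the unlearning
    # strategies the paper evaluates; it is not BadFU's own procedure.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        for x, y in camouflage_loader:
            opt.zero_grad()
            loss = -loss_fn(model(x), y)  # negate the loss: move away from fitting these samples
            loss.backward()
            opt.step()
    return model
```

The backdoor samples are never part of the request, so their influence persists; removing only the camouflage samples strips away the counteracting signal, which is what transitions the global model into its backdoored state.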
Problem

Research questions and friction points this paper is trying to address.

Backdoor attack through adversarial unlearning in federated learning
Malicious client exploits unlearning requests to inject backdoors
Global model compromised via camouflage sample removal strategy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial unlearning requests inject backdoors
Malicious client trains with camouflage samples alongside backdoor samples
Global model transitions to backdoored state
Bingguang Lu
University of Newcastle, Newcastle, NSW, Australia

Hongsheng Hu
Lecturer, School of Information and Physical Sciences, University of Newcastle
Trustworthy Machine Learning, Machine Unlearning

Yuantian Miao
University of Newcastle, Newcastle, NSW, Australia
ML Security & Privacy, Network Traffic Classification, Network Security

Shaleeza Sohail
University of Newcastle, Newcastle, NSW, Australia

Chaoxiang He
Shanghai Jiao Tong University, Shanghai, China

Shuo Wang
Shanghai Jiao Tong University, Shanghai, China

Xiao Chen
University of Newcastle, Newcastle, NSW, Australia