🤖 AI Summary
Federated learning is vulnerable to poisoning attacks, and even after removing malicious updates, their residual effects can still degrade global model performance. To address this, this work proposes FAUN, a lightweight federated unlearning framework that introduces adversarial optimization into the federated unlearning process for the first time. FAUN retains recent updates from malicious clients to generate adversarial compensation updates on a proxy dataset, then combines a small number of unlearning rounds with benign fine-tuning to efficiently restore model performance. Requiring only a short window of historical information, FAUN achieves recovery performance nearly equivalent to retraining from scratch across three standard benchmarks, significantly reduces communication overhead, and suppresses attack success rates to near zero.
📝 Abstract
Federated learning (FL) is vulnerable to poisoning attacks, in which malicious clients upload manipulated updates to degrade the performance of the global model. Although detection methods can identify and remove malicious clients, the global model remains affected by the updates those clients have already contributed. Retraining from scratch is effective but costly, and existing unlearning methods remain unsatisfactory in both effectiveness and efficiency. We propose Federated Adversarial Unlearning (FAUN), a lightweight framework that retains only a short window of malicious clients' updates and employs adversarial optimization on a proxy dataset to derive updates that eliminate malicious directions. Applying these updates for a few unlearning rounds, followed by benign fine-tuning, enables fast removal of malicious effects and stable recovery. Experiments on three canonical datasets show that FAUN achieves recovery comparable to retraining while requiring far fewer rounds, and that it reduces attack success rates to near zero, confirming that FAUN successfully eliminates the contributions of unlearned clients.
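The three stages described in the abstract (retain a short window of malicious updates, derive compensation updates on a proxy dataset, then fine-tune with benign clients) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the linear model, the squared-error loss, the function names, the parameters `lam`, `lr`, and the round counts, and above all the compensation objective, which is a simple stand-in that lowers proxy loss while pushing against the mean retained malicious direction. It is not the adversarial objective used by FAUN, which the abstract does not specify.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of half the mean squared error of a linear model."""
    return X.T @ (X @ w - y) / len(y)

def compensation_update(w, malicious_window, X, y, lam=1.0, lr=0.1, steps=50):
    """Hypothetical stand-in for the adversarial step (not the paper's objective).

    Minimizes proxy_loss(w + delta) + lam * <delta, m>, where m is the
    normalized mean of the retained malicious updates.
    """
    m = np.mean(malicious_window, axis=0)
    m = m / (np.linalg.norm(m) + 1e-12)
    delta = np.zeros_like(w)
    for _ in range(steps):
        delta -= lr * (loss_grad(w + delta, X, y) + lam * m)
    return delta

def recover(w, malicious_window, proxy, benign_clients,
            unlearn_rounds=3, finetune_rounds=100, lr=0.1):
    """A few unlearning rounds followed by benign fine-tuning."""
    X_p, y_p = proxy
    for _ in range(unlearn_rounds):
        w = w + compensation_update(w, malicious_window, X_p, y_p)
    for _ in range(finetune_rounds):
        # Plain averaging of benign client gradients (FedSGD-style, assumed).
        grads = [loss_grad(w, X_b, y_b) for X_b, y_b in benign_clients]
        w = w - lr * np.mean(grads, axis=0)
    return w

# Toy setup: all data here is synthetic and purely illustrative.
rng = np.random.default_rng(0)
dim = 5
w_true = rng.normal(size=dim)

def make_data(n):
    X = rng.normal(size=(n, dim))
    return X, X @ w_true

proxy = make_data(200)
benign_clients = [make_data(100) for _ in range(3)]

# Short window of retained malicious updates sharing one dominant direction.
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)
malicious_window = [2.0 * direction + 0.1 * rng.normal(size=dim) for _ in range(5)]

w_poisoned = w_true + np.sum(malicious_window, axis=0)
w_recovered = recover(w_poisoned, malicious_window, proxy, benign_clients)

err_before = np.linalg.norm(w_poisoned - w_true)
err_after = np.linalg.norm(w_recovered - w_true)
print(f"distance to clean model: before={err_before:.3f}, after={err_after:.3f}")
```

In this toy setting the unlearning rounds deliberately overshoot against the malicious direction, and the benign fine-tuning rounds then pull the model back toward the clean solution, which loosely mirrors the "fast removal, then stable recovery" structure the abstract describes.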