🤖 AI Summary
Backdoor attacks in federated learning often suffer from poor robustness due to static triggers and strong coupling between the primary and backdoor tasks, leaving them vulnerable to dilution by benign model updates and ineffective under common defenses. To address this, we propose a task-decoupled robust backdoor attack framework. Our method employs a dynamic min-max optimization: the inner loop maximizes the performance gap between poisoned and clean samples, while the outer loop adaptively generates semantically concealed, input-dependent triggers, thereby decoupling the primary and backdoor tasks. We also design a lightweight local model injection mechanism compatible with mainstream backdoor attack paradigms. Extensive experiments on CV and NLP benchmarks show that our approach outperforms six state-of-the-art attacks under six representative defense mechanisms, with consistent gains in both attack success rate and persistence.
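To make the input-dependent trigger idea concrete, here is a minimal PyTorch sketch of a generator that maps each image to its own bounded, low-visibility trigger. The architecture, the perturbation bound `epsilon`, and the name `TriggerGenerator` are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Hypothetical module: maps each input to its own bounded trigger."""

    def __init__(self, channels: int = 3, epsilon: float = 8 / 255):
        super().__init__()
        self.epsilon = epsilon  # L-infinity bound keeps the trigger visually concealed
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, channels, kernel_size=3, padding=1),
            nn.Tanh(),  # squashes the raw perturbation into [-1, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.epsilon * self.net(x)   # bounded, input-dependent perturbation
        return (x + delta).clamp(0.0, 1.0)   # poisoned sample stays a valid image
```

Because the trigger is a function of the input rather than a fixed patch, each poisoned sample carries a different perturbation, which is what the summary means by "input-dependent" and "semantically concealed" triggers.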
📝 Abstract
Federated learning allows multiple participants to collaboratively train a central model without sharing their private data. However, this distributed nature also exposes new attack surfaces. In particular, backdoor attacks enable an adversary to implant malicious behaviors into the global model while maintaining high accuracy on benign inputs. Existing attacks usually rely on fixed patterns or adversarial perturbations as triggers, which tightly couple the main and backdoor tasks. This coupling makes them vulnerable to dilution by honest updates and limits their persistence under federated defenses. In this work, we decouple the backdoor task from the main task by dynamically optimizing the backdoor trigger within a min-max framework. The inner optimization maximizes the performance gap between poisoned and benign samples, ensuring that the contributions of benign users have minimal impact on the backdoor. The outer step injects the adaptive triggers into the local model. We evaluate our method on both computer vision and natural language tasks, comparing it with six backdoor attack methods under six defense algorithms. Experimental results show that our method achieves higher attack success rates and stronger persistence than the baselines, and integrates easily with existing backdoor attack techniques.
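The alternating min-max procedure might be instantiated roughly as follows. This sketch reuses the hypothetical `TriggerGenerator` from the earlier snippet; the exact objectives assigned to the inner and outer steps, the optimizers, and the loss weighting are one plausible reading of the abstract, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def local_backdoor_round(model, trigger_gen, loader, target_label: int,
                         inner_steps: int = 1, device: str = "cpu"):
    """Illustrative local training round for a malicious client."""
    model_opt = torch.optim.SGD(model.parameters(), lr=0.01)
    trig_opt = torch.optim.Adam(trigger_gen.parameters(), lr=1e-3)

    for x, y in loader:
        x, y = x.to(device), y.to(device)
        y_bd = torch.full_like(y, target_label)  # attacker-chosen target label

        # Inner step(s): fit the local model so poisoned inputs map to the
        # target label while clean inputs keep their true labels, widening
        # the behavioral gap between poisoned and benign samples.
        for _ in range(inner_steps):
            x_poison = trigger_gen(x).detach()  # freeze the trigger here
            loss = F.cross_entropy(model(x), y) \
                 + F.cross_entropy(model(x_poison), y_bd)
            model_opt.zero_grad()
            loss.backward()
            model_opt.step()

        # Outer step: adapt the trigger generator so its input-dependent
        # triggers reliably activate the backdoor in the current model,
        # i.e., inject the adaptive trigger into the local update.
        trig_loss = F.cross_entropy(model(trigger_gen(x)), y_bd)
        trig_opt.zero_grad()
        trig_loss.backward()
        trig_opt.step()

    return model  # the poisoned local update submitted to the server
```

Keeping the trigger adaptation separate from the model update is what lets the backdoor objective evolve independently of the main task, so that aggregation with benign clients dilutes it more slowly than a fixed-trigger attack.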