AI Summary
Federated learning (FL) for NLP faces severe threats from stealthy and persistent backdoor attacks; existing attack methods fail on large language models (e.g., GPT-2) and lack robustness across training rounds and against defenses. This paper proposes SDBA, a Stealthy and Durable Backdoor Attack framework that identifies attack-susceptible layers in LSTM and GPT-2 models via layer-wise sensitivity analysis, then applies intra-layer gradient masking and top-k% sparse gradient clipping, ensuring high stealth during both training and inference. Client-side local poisoning provides strong attack persistence. SDBA is the first backdoor attack for FL-NLP that simultaneously achieves *stealth* (evading detection) and *durability* (withstanding cross-round dynamics and defenses including Krum, RFA, and norm clipping). It maintains an attack success rate above 92% against FedAvg aggregation on next-token prediction and sentiment analysis tasks, and remains effective for over 50 rounds on GPT-2.
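The core masking idea described above can be sketched in a few lines. The helper below is a minimal, illustrative stand-in (not the authors' implementation): given one layer's gradient, it keeps only the top-k% entries by magnitude and zeros the rest, which is how a malicious client could concentrate its backdoor update in a small, hard-to-detect fraction of parameters. The function name and the use of flat Python lists are assumptions for illustration.

```python
def topk_percent_mask(grad, k_percent):
    """Keep only the top-k% largest-magnitude entries of a layer gradient.

    `grad` is a flat list of floats standing in for one layer's gradient;
    all other entries are zeroed. Illustrative sketch only.
    """
    if not grad:
        return []
    # At least one entry survives, even for very small k.
    k = max(1, int(len(grad) * k_percent / 100))
    # Indices of the k entries with the largest absolute value.
    keep = set(
        sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    )
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


# With k=50%, the two largest-magnitude entries survive.
masked = topk_percent_mask([0.1, -0.9, 0.5, 0.05], 50)
```

In a real FL round, such a mask would be applied to the attacker's local update for the selected sensitive layers before it is sent to the server, keeping the update's norm and footprint small enough to pass defenses like norm clipping.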
Abstract
Federated Learning is a promising approach for training machine learning models while preserving data privacy, but its distributed nature makes it vulnerable to backdoor attacks. This vulnerability is particularly acute in NLP tasks, where related research remains limited. This paper introduces SDBA, a novel backdoor attack mechanism designed for NLP tasks in FL environments. Our systematic analysis across LSTM and GPT-2 models identifies the layers most vulnerable to backdoor injection, and SDBA achieves both stealth and long-lasting durability through layer-wise gradient masking and top-k% gradient masking within these layers. Experiments on next-token prediction and sentiment analysis tasks show that SDBA outperforms existing backdoor attacks in durability and effectively bypasses representative defense mechanisms, with notably strong performance on LLMs such as GPT-2. These results underscore the need for robust defense strategies in NLP-based FL systems.
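The layer-wise vulnerability analysis mentioned above can be illustrated with a simple sketch. The snippet below ranks layers by mean absolute gradient magnitude, one plausible sensitivity proxy; the function name, the dict-of-lists representation, and the choice of metric are assumptions for illustration, not the paper's exact procedure.

```python
def rank_layers_by_sensitivity(layer_grads):
    """Rank layers from most to least sensitive.

    `layer_grads` maps a layer name to a flat list of gradient values
    observed on backdoor (trigger) inputs. Sensitivity is approximated
    here by mean absolute gradient magnitude; layers at the front of
    the returned list are candidate targets for backdoor injection.
    """
    def mean_abs(g):
        return sum(abs(x) for x in g) / len(g) if g else 0.0

    return sorted(layer_grads, key=lambda name: mean_abs(layer_grads[name]),
                  reverse=True)


# Hypothetical per-layer gradients for a small LSTM model.
ranking = rank_layers_by_sensitivity({
    "embedding": [0.01, 0.02],
    "lstm": [0.5, 0.4],
    "output_head": [0.1],
})
```

An attacker would then restrict poisoned updates to the top-ranked layer(s), which is what lets the attack stay both effective and inconspicuous in aggregate statistics.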