🤖 AI Summary
Existing stealthy backdoor attacks rely on white-box or black-box model access or auxiliary data, limiting their practicality. To address this, we propose ReVeil, the first unconstrained stealthy backdoor attack injected during data collection, requiring neither model access nor additional data. Its core innovation is the integration of machine unlearning: before deployment, trigger samples induce unlearning that suppresses the attack success rate (ASR) to ≤6.5%, evading three major detection paradigms; after deployment, reversing the forgetting restores the ASR to over 92%. ReVeil combines trigger-pattern injection with robustness across datasets and trigger designs: we validate its efficacy on four benchmark datasets and four distinct trigger patterns. Crucially, ReVeil is the first attack to leverage machine unlearning synergistically for both backdoor activation and stealth, overcoming key practicality bottlenecks of stealthy backdoor attacks.
📄 Abstract
Backdoor attacks embed hidden functionality in deep neural networks (DNNs), triggering malicious behavior on specific inputs. Advanced defenses monitor anomalous DNN inferences to detect such attacks. However, concealed backdoors evade detection by maintaining a low attack success rate (ASR) before deployment and restoring a high ASR after deployment via machine unlearning. Existing concealed backdoors are constrained by requiring white-box or black-box model access or auxiliary data, limiting their practicality when such access or data is unavailable. This paper introduces ReVeil, a concealed backdoor attack targeting the data collection phase of the DNN training pipeline that requires no model access or auxiliary data. ReVeil maintains a low pre-deployment ASR across four datasets and four trigger patterns, successfully evades three popular backdoor detection methods, and restores a high ASR post-deployment through machine unlearning.
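Since ReVeil operates purely at the data collection stage, its entry point is ordinary data poisoning: stamping a trigger pattern onto a small fraction of collected samples and relabeling them to an attacker-chosen target class. The paper's actual trigger designs and its unlearning-based concealment mechanism are not reproduced here; the sketch below only illustrates generic trigger stamping, using a hypothetical 3×3 corner-patch trigger and a hypothetical `poison_dataset` helper.

```python
import numpy as np

def apply_trigger(image, trigger, mask):
    """Blend a trigger pattern into an image wherever mask is nonzero."""
    return image * (1 - mask) + trigger * mask

def poison_dataset(images, labels, target_label, rate=0.1, seed=0):
    """Poison a fraction `rate` of samples: stamp the trigger and relabel
    them to the attacker-chosen target class. Returns the poisoned copies
    and the indices that were modified."""
    rng = np.random.default_rng(seed)
    n = len(images)
    idx = rng.choice(n, size=int(rate * n), replace=False)
    # Hypothetical trigger: a 3x3 white patch in the bottom-right corner.
    trigger = np.zeros_like(images[0])
    mask = np.zeros_like(images[0])
    trigger[-3:, -3:] = 1.0
    mask[-3:, -3:] = 1.0
    poisoned = images.copy()
    new_labels = labels.copy()
    for i in idx:
        poisoned[i] = apply_trigger(poisoned[i], trigger, mask)
        new_labels[i] = target_label
    return poisoned, new_labels, idx
```

A victim who trains on the returned data would associate the corner patch with the target label; ReVeil's contribution lies in additionally keeping that association dormant (low ASR) until after deployment, which this sketch does not attempt to model.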