🤖 AI Summary
Existing clean-image backdoor attacks face a fundamental trade-off between stealthiness, measured by minimal degradation in clean accuracy (CA), and attack effectiveness. To address this, we propose Generative Clean-Image Backdoors (GCB), the first framework to leverage a conditional InfoGAN to learn semantically coherent, disentangled trigger features directly from natural images. GCB combines minimal poisoning (fewer than 1% of training samples) with feature-decoupled training to substantially reduce trigger interference with the primary task. Evaluated across six benchmark datasets, five model architectures, and four task types, including classification, regression, and segmentation, GCB achieves a CA drop of less than 1% while maintaining high attack success rates and robustness against state-of-the-art defenses. Notably, this work pioneers effective and stealthy backdoor attacks for regression and segmentation tasks, establishing a new paradigm for multimodal and multi-task backdoor research.
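The "feature-decoupled" idea above — choosing a trigger feature that interferes as little as possible with the primary task — can be illustrated with a small sketch. This is not the authors' code: it assumes we already have a discrete latent code per image for each candidate InfoGAN code head (a hypothetical `code_matrix`), and simply picks the candidate code that carries the least empirical mutual information about the benign labels.

```python
import numpy as np

def empirical_mi(codes, labels):
    """Empirical mutual information I(code; label) in nats from discrete samples."""
    joint = np.zeros((codes.max() + 1, labels.max() + 1))
    for c, y in zip(codes, labels):
        joint[c, y] += 1
    joint /= joint.sum()
    pc = joint.sum(axis=1, keepdims=True)   # marginal over codes
    py = joint.sum(axis=0, keepdims=True)   # marginal over labels
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pc @ py)[nz])).sum())

def pick_trigger_code(code_matrix, labels):
    """Among candidate latent codes (one column per code head), return the index
    of the code least informative about the benign task labels, i.e. the one
    most easily separable from task-related features."""
    scores = [empirical_mi(code_matrix[:, k], labels) for k in range(code_matrix.shape[1])]
    return int(np.argmin(scores))
```

A code that is nearly independent of the task labels makes a good trigger candidate: relabeling images that carry it conflicts least with what the clean model must learn, which is one intuition behind the small CA drop.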
📝 Abstract
Clean-image backdoor attacks, which use only label manipulation in training datasets to compromise deep neural networks, pose a significant threat to security-critical applications. A critical flaw in existing methods is that the poison rate required for a successful attack induces a proportional, and thus noticeable, drop in Clean Accuracy (CA), undermining their stealthiness. This paper presents a new paradigm for clean-image attacks that minimizes this accuracy degradation by optimizing the trigger itself. We introduce Generative Clean-Image Backdoors (GCB), a framework that uses a conditional InfoGAN to identify naturally occurring image features that can serve as potent and stealthy triggers. By ensuring these triggers are easily separable from benign task-related features, GCB enables a victim model to learn the backdoor from an extremely small set of poisoned examples, resulting in a CA drop of less than 1%. Our experiments demonstrate GCB's remarkable versatility, successfully adapting to six datasets, five architectures, and four tasks, including the first demonstration of clean-image backdoors in regression and segmentation. GCB also exhibits resilience against most existing backdoor defenses.
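The defining constraint of a clean-image attack, that only labels are manipulated, can be sketched as a simple poisoning step. The sketch below is illustrative, not the authors' implementation: it assumes each image has already been assigned an inferred latent code (e.g. by the InfoGAN's recognition head), and relabels at most a `max_rate` fraction of the dataset, drawn only from images that naturally carry the trigger code.

```python
import numpy as np

def poison_labels(codes, labels, trigger_code, target_label, max_rate=0.01, seed=0):
    """Clean-image, label-only poisoning sketch.

    Among images whose inferred latent code equals the trigger code, relabel at
    most max_rate of the whole dataset to the attacker's target class.
    The images themselves are never modified.
    """
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(codes == trigger_code)   # images carrying the trigger feature
    budget = int(len(labels) * max_rate)                 # e.g. <1% of the dataset
    chosen = rng.choice(candidates, size=min(budget, len(candidates)), replace=False)
    poisoned = labels.copy()
    poisoned[chosen] = target_label
    return poisoned, chosen
```

Because the poison budget is a fraction of the full dataset rather than of the candidate pool, the attack stays within the sub-1% regime the paper targets, which is what keeps the CA degradation small.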