🤖 AI Summary
Existing clean-image backdoor attacks face a fundamental trade-off between stealthiness, measured by minimal degradation in clean accuracy (CA), and attack effectiveness. To address this, we propose Generative Clean-Image Backdoors (GCB), the first framework to leverage a conditional InfoGAN to learn semantically coherent, disentangled trigger features directly from natural images. GCB combines minimal poisoning (fewer than 1% of training samples) with feature-decoupled training to substantially reduce trigger interference with the primary task. Evaluated across six benchmark datasets, five model architectures, and four task types, including classification, regression, and segmentation, GCB achieves a CA drop of less than 1% while maintaining high attack success rates and robustness against state-of-the-art defenses. Notably, this work pioneers effective and stealthy backdoor attacks for regression and segmentation tasks, establishing a new paradigm for multimodal and multi-task backdoor research.
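The "feature-decoupled" idea above — choosing a trigger feature that interferes as little as possible with the primary task — can be illustrated with a small sketch. This is not the authors' code: it assumes we already have a discrete latent code per image for each candidate InfoGAN code head (a hypothetical `code_matrix`), and simply picks the candidate code that carries the least empirical mutual information about the benign labels.

```python
import numpy as np

def empirical_mi(codes, labels):
    """Empirical mutual information I(code; label) in nats from discrete samples."""
    joint = np.zeros((codes.max() + 1, labels.max() + 1))
    for c, y in zip(codes, labels):
        joint[c, y] += 1
    joint /= joint.sum()
    pc = joint.sum(axis=1, keepdims=True)   # marginal over codes
    py = joint.sum(axis=0, keepdims=True)   # marginal over labels
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pc @ py)[nz])).sum())

def pick_trigger_code(code_matrix, labels):
    """Among candidate latent codes (one column per code head), return the index
    of the code least informative about the benign task labels, i.e. the one
    most easily separable from task-related features."""
    scores = [empirical_mi(code_matrix[:, k], labels) for k in range(code_matrix.shape[1])]
    return int(np.argmin(scores))
```

A code that is nearly independent of the task labels makes a good trigger candidate: relabeling images that carry it conflicts least with what the clean model must learn, which is one intuition behind the small CA drop.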
📝 Abstract
Clean-image backdoor attacks, which use only label manipulation in training datasets to compromise deep neural networks, pose a significant threat to security-critical applications. A critical flaw in existing methods is that the poison rate required for a successful attack induces a proportional, and thus noticeable, drop in Clean Accuracy (CA), undermining their stealthiness. This paper presents a new paradigm for clean-image attacks that minimizes this accuracy degradation by optimizing the trigger itself. We introduce Generative Clean-Image Backdoors (GCB), a framework that uses a conditional InfoGAN to identify naturally occurring image features that can serve as potent and stealthy triggers. By ensuring these triggers are easily separable from benign task-related features, GCB enables a victim model to learn the backdoor from an extremely small set of poisoned examples, resulting in a CA drop of less than 1%. Our experiments demonstrate GCB's remarkable versatility, successfully adapting to six datasets, five architectures, and four tasks, including the first demonstration of clean-image backdoors in regression and segmentation. GCB also exhibits resilience against most existing backdoor defenses.
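The defining constraint of a clean-image attack, that only labels are manipulated, can be sketched as a simple poisoning step. The sketch below is illustrative, not the authors' implementation: it assumes each image has already been assigned an inferred latent code (e.g. by the InfoGAN's recognition head), and relabels at most a `max_rate` fraction of the dataset, drawn only from images that naturally carry the trigger code.

```python
import numpy as np

def poison_labels(codes, labels, trigger_code, target_label, max_rate=0.01, seed=0):
    """Clean-image, label-only poisoning sketch.

    Among images whose inferred latent code equals the trigger code, relabel at
    most max_rate of the whole dataset to the attacker's target class.
    The images themselves are never modified.
    """
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(codes == trigger_code)   # images carrying the trigger feature
    budget = int(len(labels) * max_rate)                 # e.g. <1% of the dataset
    chosen = rng.choice(candidates, size=min(budget, len(candidates)), replace=False)
    poisoned = labels.copy()
    poisoned[chosen] = target_label
    return poisoned, chosen
```

Because the poison budget is a fraction of the full dataset rather than of the candidate pool, the attack stays within the sub-1% regime the paper targets, which is what keeps the CA degradation small.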