Poison in the Well: Feature Embedding Disruption in Backdoor Attacks

📅 2025-05-26
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing backdoor attacks suffer from strong dependence on training data, poor stealthiness, and insufficient robustness. This paper proposes ShadowPrint, a backdoor attack that reduces dependence on training-data access and supports clean-label operation at an extremely low poisoning rate (as low as 0.01%). ShadowPrint aligns trigger samples in the feature space via clustering-driven optimization of intermediate-layer feature embeddings. Evaluated across multiple datasets (CIFAR-10, ImageNet) and architectures (ResNet, ViT), ShadowPrint achieves attack success rates of up to 100%, reduces clean accuracy by no more than 1% in most cases, and incurs a detection rate (DDR) averaging below 5%. These results demonstrate significant improvements in stealthiness, robustness, and practicality over prior methods.

๐Ÿ“ Abstract
Backdoor attacks embed malicious triggers into training data, enabling attackers to manipulate neural network behavior during inference while maintaining high accuracy on benign inputs. However, existing backdoor attacks face limitations manifesting in excessive reliance on training data, poor stealth, and instability, which hinder their effectiveness in real-world applications. Therefore, this paper introduces ShadowPrint, a versatile backdoor attack that targets feature embeddings within neural networks to achieve high ASRs and stealthiness. Unlike traditional approaches, ShadowPrint reduces reliance on training data access and operates effectively with exceedingly low poison rates (as low as 0.01%). It leverages a clustering-based optimization strategy to align feature embeddings, ensuring robust performance across diverse scenarios while maintaining stability and stealth. Extensive evaluations demonstrate that ShadowPrint achieves superior ASR (up to 100%), steady CA (with decay no more than 1% in most cases), and low DDR (averaging below 5%) across both clean-label and dirty-label settings, and with poison rates ranging from as low as 0.01% to 0.05%, setting a new standard for backdoor attack capabilities and emphasizing the need for advanced defense strategies focused on feature space manipulations.
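To put these poison rates in perspective, a quick back-of-the-envelope calculation (assuming CIFAR-10's standard 50,000-image training split, one of the datasets the summary mentions):

```python
# Number of poisoned samples implied by the reported poison rates,
# assuming CIFAR-10's standard 50,000-image training split.
train_size = 50_000
for rate in (0.0001, 0.0005):  # 0.01% and 0.05%
    print(f"{rate:.2%} -> {round(train_size * rate)} poisoned samples")
```

At the 0.01% rate, only about 5 training images carry the trigger, which is why the attack's stealth claims hinge on so few samples being modified.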
Problem

Research questions and friction points this paper is trying to address.

Enhancing backdoor attack stealth and stability
Reducing reliance on training data access
Achieving high attack success with low poison rates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Targets feature embeddings for high ASRs
Uses clustering-based optimization strategy
Operates effectively with 0.01% poison rate
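The innovation bullets above can be sketched in miniature: optimize a trigger so that triggered inputs collapse into one tight cluster in an intermediate feature space. This is a hypothetical illustration of the clustering objective only; the toy tanh feature extractor and the random-search optimizer are assumptions, not ShadowPrint's actual method.

```python
# Minimal sketch (assumptions labeled): a clustering-driven trigger
# objective. The tanh "intermediate layer" and greedy random search are
# illustrative stand-ins for a real network and gradient-based optimization.
import math
import random

random.seed(0)
DIM, FEAT = 8, 4  # toy input / feature-embedding sizes

# Fixed random nonlinear map standing in for a network's intermediate layer.
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(FEAT)]

def embed(x):
    """Feature embedding of input x under the toy extractor."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def clustering_loss(trigger, samples):
    """Mean pairwise squared distance between embeddings of triggered
    samples; small values mean triggered inputs align in feature space."""
    embs = [embed([s + t for s, t in zip(x, trigger)]) for x in samples]
    n, total = len(embs), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum((a - b) ** 2 for a, b in zip(embs[i], embs[j]))
    return total / (n * (n - 1) / 2)

samples = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(16)]
trigger = [0.0] * DIM
initial_loss = clustering_loss(trigger, samples)

# Greedy random search over the trigger: keep a perturbation only if it
# tightens the cluster of triggered embeddings.
best = initial_loss
for _ in range(300):
    cand = [t + random.gauss(0, 0.2) for t in trigger]
    loss = clustering_loss(cand, samples)
    if loss < best:
        trigger, best = cand, loss
```

Driving this loss toward zero makes any triggered input map to nearly the same embedding, which is why a single optimized trigger can steer predictions regardless of the clean content it is stamped onto.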