AI Summary
Existing backdoor attacks against deep neural networks lack theoretical foundations and rely heavily on heuristic, brute-force search strategies. Method: This paper establishes the first rigorous theoretical framework for backdoor attacks, revealing that sparse decision boundaries inherently render models highly sensitive to minute quantities of poisoned samples. We introduce the concept of "closed-form fuzzy boundary regions" and integrate influence function analysis with margin-driven boundary manipulation to achieve highly robust and stealthy black-box attacks at an ultra-low poisoning rate (<0.1%). Contribution/Results: Our framework achieves a >90% attack success rate while preserving clean accuracy with negligible degradation. It further demonstrates superior cross-model, cross-dataset, and cross-scenario transferability compared to state-of-the-art methods. By providing an interpretable and predictive theoretical foundation, this work advances both the modeling and defense of backdoor attacks.
Abstract
Deep neural networks (DNNs) underpin critical applications yet remain vulnerable to backdoor attacks, which typically rely on heuristic, brute-force methods. Despite significant empirical advances in backdoor research, the lack of rigorous theoretical analysis limits understanding of the underlying mechanisms, constraining attack predictability and adaptability. We therefore provide a theoretical analysis of backdoor attacks, focusing on how sparse decision boundaries enable disproportionate model manipulation. Based on this finding, we derive a closed-form, ambiguous boundary region wherein a negligible number of relabeled samples induces substantial misclassification. Influence function analysis further quantifies the significant parameter shifts caused by these margin samples, which have minimal impact on clean accuracy, formally grounding why such low poison rates suffice for effective attacks. Leveraging these insights, we propose Eminence, an explainable and robust black-box backdoor framework with provable theoretical guarantees and inherent stealth properties. Eminence optimizes a universal, visually subtle trigger that strategically exploits vulnerable decision boundaries and achieves robust misclassification at exceptionally low poison rates (<0.1%, compared to SOTA methods that typically require >1%). Comprehensive experiments validate our theoretical analysis and demonstrate the effectiveness of Eminence, confirming an exponential relationship between margin poisoning and adversarial boundary manipulation. Eminence maintains a >90% attack success rate, exhibits negligible clean-accuracy loss, and demonstrates high transferability across diverse models, datasets, and scenarios.
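The influence-function analysis referenced above is typically built on the classical first-order approximation of how upweighting a single training point shifts the learned parameters. As a hedged sketch (this is the standard formulation, e.g. from the influence-functions literature, not necessarily the paper's exact derivation), let L(z, θ) be the training loss on point z, θ̂ the minimizer of the empirical risk over n points, and H the empirical Hessian at θ̂:

$$
H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n}\nabla_\theta^2 L(z_i,\hat\theta),
\qquad
\hat\theta_{\epsilon,z} - \hat\theta \;\approx\; -\,\epsilon\, H_{\hat\theta}^{-1}\,\nabla_\theta L(z,\hat\theta).
$$

Under this approximation, a relabeled sample z lying near the decision margin has a large gradient ∇θ L(z, θ̂), so even a small upweighting ε (i.e., a tiny poison rate) can induce a disproportionately large parameter shift, consistent with the abstract's claim that sub-0.1% poisoning suffices.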