Removing the Trigger, Not the Backdoor: Alternative Triggers and Latent Backdoors

📅 2026-03-10
🤖 AI Summary
This work addresses a limitation of current backdoor defenses: they focus on removing known triggers and overlook that perceptually distinct alternative triggers can still activate the same backdoor. The study argues that a backdoor is better characterized in feature space than in input space, and proposes a feature-guided attack that jointly optimizes target prediction and alignment with the backdoor direction. It gives a theoretical proof that alternative triggers exist as a consequence of backdoor training, and validates this empirically. Experiments show that defenses which remove the training trigger often leave the backdoor intact, and that the feature-guided attack can exploit the remaining latent backdoor. The findings motivate a shift from input-space trigger removal to defenses that target backdoor directions in representation space.

📝 Abstract
Current backdoor defenses assume that neutralizing a known trigger removes the backdoor. We show that this trigger-centric view is incomplete: alternative triggers, patterns perceptually distinct from the training triggers, reliably activate the same backdoor. To find them, we estimate the backdoor direction in feature space by contrasting clean and triggered representations, and then develop a feature-guided attack that jointly optimizes target prediction and directional alignment. We first prove theoretically that alternative triggers exist and are an inevitable consequence of backdoor training, and then verify this empirically. We also find that defenses that remove training triggers often leave the backdoor intact, so alternative triggers can still exploit the latent backdoor in feature space. Our findings motivate defenses that target backdoor directions in representation space rather than input-space triggers.
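
The two steps named in the abstract (estimating a backdoor direction by contrasting clean and triggered representations, then optimizing a perturbation for both target prediction and directional alignment) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not the paper's implementation: the feature extractor is a toy linear map `W`, the classifier head is a linear map `V`, the direction is taken as the normalized difference of mean features, alignment is measured as the cosine between the perturbed features and that direction, and the names `backdoor_direction`, `feature_guided_attack`, and the weight `lam` are hypothetical.

```python
import numpy as np

def backdoor_direction(clean_feats, triggered_feats):
    """Assumed estimator: unit vector from mean clean to mean triggered features."""
    diff = triggered_feats.mean(axis=0) - clean_feats.mean(axis=0)
    return diff / np.linalg.norm(diff)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def feature_guided_attack(x, W, V, target, direction, lam=1.0, lr=0.1, steps=200):
    """Gradient descent on an additive perturbation `delta` for a toy linear model.

    Features are f = W @ (x + delta) and logits are V @ f (both hypothetical).
    Assumed joint loss: cross_entropy(logits, target) - lam * cos(f, direction).
    """
    delta = np.zeros_like(x, dtype=float)
    onehot = np.eye(V.shape[0])[target]
    for _ in range(steps):
        f = W @ (x + delta)
        n = np.linalg.norm(f) + 1e-12
        # Gradient of the cross-entropy term with respect to the features.
        grad_ce = V.T @ (softmax(V @ f) - onehot)
        # Gradient of cos(f, direction) with respect to f (direction is unit norm).
        grad_cos = direction / n - (f @ direction) * f / n**3
        # Chain rule through the linear feature map, then a descent step.
        delta -= lr * (W.T @ (grad_ce - lam * grad_cos))
    return delta

# Toy data: the "trigger" shifts every feature vector by +3 along the second axis.
clean = np.array([[1.0, 0.0], [0.5, 0.2]])
triggered = clean + np.array([0.0, 3.0])
d = backdoor_direction(clean, triggered)

W = np.eye(2)   # toy feature extractor
V = np.eye(2)   # toy classifier head, class 1 is the attack target
x = np.array([1.0, 0.0])
delta = feature_guided_attack(x, W, V, target=1, direction=d)
f_adv = W @ (x + delta)
```

In this toy setup the perturbation is never told what the original trigger looked like; it is only steered toward the target class and toward the estimated direction. A real experiment would replace `W` and `V` with a trained network's feature extractor and head and use automatic differentiation in place of the hand-written gradients.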

Problem

Research questions and friction points this paper is trying to address.

backdoor
alternative triggers
feature space
trigger-centric defense
latent backdoor

Innovation

Methods, ideas, or system contributions that make the work stand out.

alternative triggers
latent backdoors
feature-space backdoor direction
trigger-agnostic defense
representation space