Trapping Attacker in Dilemma: Examining Internal Correlations and External Influences of Trigger for Defending GNN Backdoors

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Graph Neural Networks (GNNs) are vulnerable to backdoor attacks, and existing defenses often fail against adaptive adversaries because they rely on surface-level cues such as specific subgraph patterns or node features. This work proposes PRAETORIAN, a defense framework grounded in the intrinsic requirements of effective backdoor attacks. By jointly analyzing the internal correlations within candidate trigger subgraphs and the external influence of their constituent nodes, PRAETORIAN identifies abnormally large injected structures and trigger nodes with disproportionate impact. This forces attackers into a trade-off between attack success rate (ASR) and the model's clean accuracy (CA). In the reported evaluations, PRAETORIAN reduces the average ASR to 0.55% with only a 0.62% drop in CA, whereas state-of-the-art defenses still yield an average ASR above 20% and a CA drop above 3% under the same conditions. It also remains effective against a range of adaptive attacks.
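The two headline metrics in this summary, attack success rate and clean-accuracy drop, can be made concrete with a short sketch. This is an illustration of the commonly used definitions, not the paper's evaluation code: the function names are hypothetical, and the choice of which nodes count as "triggered" (for example, excluding nodes whose true label is already the target class) is an assumption left to the caller.

```python
from typing import Sequence


def attack_success_rate(preds_on_triggered: Sequence[int], target_class: int) -> float:
    """Fraction of triggered nodes that the model assigns to the attacker's target class.

    Assumption: the caller has already selected the triggered evaluation nodes.
    """
    if len(preds_on_triggered) == 0:
        raise ValueError("need at least one triggered node")
    hits = sum(1 for p in preds_on_triggered if p == target_class)
    return hits / len(preds_on_triggered)


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Plain accuracy on nodes that carry no trigger."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and of equal length")
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


def clean_accuracy_drop(ca_reference: float, ca_defended: float) -> float:
    """Accuracy lost relative to a reference model (positive means the defense costs accuracy)."""
    return ca_reference - ca_defended
```

Under these definitions, a defense is better when it drives `attack_success_rate` toward zero while keeping `clean_accuracy_drop` small.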

📝 Abstract

GNNs have become a standard tool for learning on relational data, yet they remain highly vulnerable to backdoor attacks. Prior defenses often depend on inspecting specific subgraph patterns or node features, and thus can be circumvented by adaptive attackers. We propose PRAETORIAN, a new defense that targets intrinsic requirements of effective GNN backdoors rather than surface-level cues. Our key observation is that flipping a victim node's prediction requires substantial influence on the victim: attackers tend to either inject many trigger nodes or rely on a small set of highly influential ones. Building on this observation, PRAETORIAN (i) analyzes internal correlations within potential trigger subgraphs to detect abnormally large injected structures, and (ii) quantifies external node influence to identify triggers with disproportionate impact. Across our evaluations, PRAETORIAN reduces the average attack success rate (ASR) to 0.55% with only a 0.62% drop in clean accuracy (CA), whereas state-of-the-art defenses still yield an average ASR of >20% and a CA drop of >3% under the same conditions. Moreover, PRAETORIAN remains effective against a range of adaptive attacks, forcing adversaries to either inject many trigger nodes to achieve high ASR (>80%), which incurs a >10% CA drop, or preserve CA at the cost of limiting ASR to 18.1%. Overall, PRAETORIAN constrains attackers to an unfavorable trade-off between efficacy and detectability.
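To make the two signals in the abstract more tangible, here is a minimal toy sketch. It is not PRAETORIAN's algorithm, which the abstract does not specify. It assumes a single round of mean aggregation as a stand-in for a GNN layer, uses leave-one-out ablation as a stand-in for "external influence", and uses the size of a connected group of suspect nodes as a stand-in for "abnormally large injected structures". All names (`mean_aggregate`, `neighbor_influence`, `component_size`) are hypothetical.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]          # undirected adjacency sets
Features = Dict[int, List[float]]    # node id -> feature vector


def mean_aggregate(graph: Graph, feats: Features, node: int,
                   skip: Optional[int] = None) -> List[float]:
    """Mean of the node's own features and its neighbors' features.

    If `skip` is given, that neighbor is left out (leave-one-out ablation).
    """
    members = [node] + [n for n in sorted(graph[node]) if n != skip]
    dim = len(feats[node])
    return [sum(feats[m][d] for m in members) / len(members) for d in range(dim)]


def neighbor_influence(graph: Graph, feats: Features, victim: int) -> Dict[int, float]:
    """L1 change in the victim's aggregated representation when each neighbor is removed."""
    base = mean_aggregate(graph, feats, victim)
    scores: Dict[int, float] = {}
    for nb in graph[victim]:
        ablated = mean_aggregate(graph, feats, victim, skip=nb)
        scores[nb] = sum(abs(a - b) for a, b in zip(base, ablated))
    return scores


def component_size(graph: Graph, start: int, allowed: Set[int]) -> int:
    """Number of nodes reachable from `start` while staying inside `allowed`."""
    seen = {start}
    stack = [start]
    while stack:
        cur = stack.pop()
        for nb in graph[cur]:
            if nb in allowed and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen)
```

In this toy setting, an attacker who wants to move the victim's representation has two options that mirror the dilemma described above: attach one neighbor with extreme features, which produces a large `neighbor_influence` score, or attach many mild ones, which produces a large `component_size`. A real defense would need model-aware influence estimates and calibrated thresholds, neither of which is modeled here.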

Problem

Research questions and friction points this paper is trying to address.

GNN backdoor

adversarial attack

defense mechanism

trigger detection

graph neural networks

Innovation

Methods, ideas, or system contributions that make the work stand out.

GNN backdoor defense

trigger detection

node influence analysis