Krait: A Backdoor Attack Against Graph Prompt Tuning

πŸ“… 2024-07-18
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work reveals that graph prompt tuning is highly vulnerable to backdoor attacks in few-shot learning scenarios, and that existing detection methods are ineffective against them. The authors propose Krait, the first backdoor attack framework designed specifically for graph prompts. Krait introduces a graph-prompt-level attack paradigm, three customizable trigger generation strategies, a model-agnostic label non-uniformity homophily metric for cheaply selecting poisoned candidates, and a centroid similarity-based loss that jointly optimizes attack success rate and stealthiness. Evaluated on four real-world graph datasets, Krait achieves up to 100% attack success rates by poisoning only 0.15%–2% of training nodes (as few as two nodes for one-to-one attacks) without degrading clean accuracy. The attacks also transfer to black-box settings and evade both classical and state-of-the-art defenses.

πŸ“ Abstract
Graph prompt tuning has emerged as a promising paradigm to effectively transfer general graph knowledge from pre-trained models to various downstream tasks, particularly in few-shot contexts. However, its susceptibility to backdoor attacks, where adversaries insert triggers to manipulate outcomes, raises a critical concern. We conduct the first study to investigate such vulnerability, revealing that backdoors can be disguised as benign graph prompts, thus evading detection. We introduce Krait, a novel graph prompt backdoor attack. Specifically, we propose a simple yet effective model-agnostic metric called label non-uniformity homophily to select poisoned candidates, significantly reducing computational complexity. To accommodate diverse attack scenarios and advanced attack types, we design three customizable trigger generation methods to craft prompts as triggers. We propose a novel centroid similarity-based loss function to optimize prompt tuning for attack effectiveness and stealthiness. Experiments on four real-world graphs demonstrate that Krait can efficiently embed triggers into merely 0.15% to 2% of training nodes, achieving high attack success rates without sacrificing clean accuracy. Notably, in one-to-one and all-to-one attacks, Krait achieves 100% attack success rates by poisoning as few as 2 and 22 nodes, respectively. Our experiments further show that Krait remains potent across different transfer cases, attack types, and graph neural network backbones. Additionally, Krait can be successfully extended to the black-box setting, posing more severe threats. Finally, we analyze why Krait can evade both classical and state-of-the-art defenses, and provide practical insights for detecting and mitigating this class of attacks.
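The abstract describes a per-node, model-agnostic homophily score used to rank poisoning candidates without any model forward passes. The paper's exact formula is not reproduced here; the following is a minimal illustrative sketch in plain Python, assuming a hypothetical variant that scores each node by the fraction of its neighbors carrying a different label, so high-scoring nodes sit in label-mixed neighborhoods:

```python
# Hypothetical sketch of label-based candidate selection; the exact
# "label non-uniformity homophily" metric from the paper may differ.

def label_nonuniformity(adj, labels):
    """adj: dict node -> list of neighbors; labels: dict node -> class label.
    Returns a score in [0, 1] per node: fraction of neighbors whose label
    differs from the node's own (higher = more label-mixed neighborhood)."""
    scores = {}
    for v, nbrs in adj.items():
        if not nbrs:
            scores[v] = 0.0  # isolated node: no neighborhood signal
            continue
        mismatched = sum(1 for u in nbrs if labels[u] != labels[v])
        scores[v] = mismatched / len(nbrs)
    return scores

def select_candidates(adj, labels, budget):
    """Pick the `budget` highest-scoring nodes as poisoning candidates."""
    scores = label_nonuniformity(adj, labels)
    return sorted(scores, key=scores.get, reverse=True)[:budget]
```

Because the score needs only the graph structure and labels, candidate selection stays linear in the number of edges, which is consistent with the abstract's claim of reduced computational complexity.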
Problem

Research questions and friction points this paper is trying to address.

Investigates vulnerability of graph prompt tuning to backdoor attacks
Proposes Krait, a stealthy graph prompt backdoor attack method
Demonstrates high attack success with minimal poisoned nodes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Label non-uniformity homophily for poisoned candidate selection
Customizable trigger generation methods for diverse attacks
Centroid similarity-based loss function for stealthy optimization
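The centroid similarity-based loss described above ties attack effectiveness to stealthiness during prompt tuning. The paper's actual formulation is not reproduced here; this hedged sketch assumes a simple variant that pulls a poisoned node's embedding toward the centroid of the target class in cosine terms, so triggered nodes blend into that class's embedding region rather than forming an outlier cluster a defense could flag:

```python
# Illustrative centroid-similarity term (assumed form, not the paper's
# exact loss): 1 - cos(embedding, target-class centroid), lower is better.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def class_centroid(embeddings):
    """Element-wise mean of equal-length embedding vectors."""
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(len(embeddings[0]))]

def centroid_similarity_loss(poisoned_emb, target_class_embs):
    """Penalize poisoned embeddings that stray from the target-class centroid."""
    return 1.0 - cosine(poisoned_emb, class_centroid(target_class_embs))
```

In a full pipeline this term would be added to the standard classification loss during prompt tuning, trading off clean accuracy against how tightly poisoned nodes cluster with the target class.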
πŸ”Ž Similar Papers
No similar papers found.