🤖 AI Summary
This paper presents TrustGLM, a systematic study of the robustness of Graph Large Language Models (GraphLLMs) against adversarial perturbations along three dimensions: textual node attributes, graph structure, and prompt templates, forming a unified three-dimensional evaluation framework. Methodologically, it applies state-of-the-art attack strategies from each perspective: synonym substitution on node text, edge perturbation on graph structure, and random shuffling of the candidate label set in prompts, and explores defenses based on data-augmented training and adversarial training. Extensive experiments on six benchmark datasets from diverse domains reveal that replacing merely a few semantically similar words in a node's text causes significant performance degradation; structural and prompt perturbations likewise induce substantial accuracy drops. The proposed defenses improve average robustness by over 20%. To foster reproducible research, the authors open-source a standardized, extensible evaluation library, providing both a rigorous benchmark and a practical toolkit for GraphLLM security analysis.
📝 Abstract
Inspired by the success of large language models (LLMs), there is a significant research shift from traditional graph learning methods to LLM-based graph frameworks, formally known as GraphLLMs. GraphLLMs leverage the reasoning power of LLMs by integrating three key components: the textual attributes of input nodes, the structural information of node neighborhoods, and task-specific prompts that guide decision-making. Despite their promise, the robustness of GraphLLMs against adversarial perturbations remains largely unexplored, a critical concern for deploying these models in high-stakes scenarios. To bridge this gap, we introduce TrustGLM, a comprehensive study evaluating the vulnerability of GraphLLMs to adversarial attacks across three dimensions: text, graph structure, and prompt manipulations. We implement state-of-the-art attack algorithms from each perspective to rigorously assess model resilience. Through extensive experiments on six benchmark datasets from diverse domains, our findings reveal that GraphLLMs are highly susceptible to text attacks that merely replace a few semantically similar words in a node's textual attributes. We also find that standard graph structure attack methods can significantly degrade model performance, while random shuffling of the candidate label set in prompt templates leads to substantial performance drops. Beyond characterizing these vulnerabilities, we investigate defense techniques tailored to each attack vector through data-augmented training and adversarial training, which show promising potential to enhance the robustness of GraphLLMs. We hope that our open-sourced library will facilitate rapid, equitable evaluation and inspire further innovative research in this field.
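To make the three attack surfaces concrete, here is a minimal, hypothetical sketch of each perturbation type in plain Python. This is not the TrustGLM implementation or its API; the function names, data structures, and budgets are illustrative assumptions based only on the abstract's description (synonym substitution on node text, edge flips on graph structure, and shuffling of the candidate label set in the prompt).

```python
import random

def text_attack(node_text: str, synonyms: dict, budget: int = 2) -> str:
    """Replace up to `budget` words with pre-computed semantically similar
    substitutes (a toy stand-in for synonym-substitution text attacks)."""
    words = node_text.split()
    candidates = [i for i, w in enumerate(words) if w in synonyms]
    for i in random.sample(candidates, min(budget, len(candidates))):
        words[i] = synonyms[words[i]]
    return " ".join(words)

def structure_attack(edges: set, nodes: list, budget: int = 1) -> set:
    """Flip `budget` node pairs: remove the edge if present, add it otherwise
    (a toy stand-in for edge-perturbation structure attacks)."""
    perturbed = set(edges)
    for _ in range(budget):
        u, v = random.sample(nodes, 2)
        e = (min(u, v), max(u, v))
        if e in perturbed:
            perturbed.remove(e)
        else:
            perturbed.add(e)
    return perturbed

def prompt_attack(label_set: list) -> list:
    """Randomly reorder the candidate labels listed in the prompt template."""
    shuffled = list(label_set)
    random.shuffle(shuffled)
    return shuffled
```

The abstract's key finding is that GraphLLM predictions can change under exactly these kinds of cheap, semantics-preserving perturbations, which is why the paper pairs each attack vector with a matching defense (data-augmented and adversarial training).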