LLM-Powered Text-Attributed Graph Anomaly Detection via Retrieval-Augmented Reasoning

📅 2025-11-16
🤖 AI Summary
This work addresses the lack of standardized benchmarks and realistic anomalous text generation methods for anomaly detection in Text-Attributed Graphs (TAGs). We introduce TAG-AD, a benchmark that uses Large Language Models (LLMs) to synthesize anomalous node texts that are semantically plausible yet contextually inconsistent, alongside several other anomaly types. We further propose a Retrieval-Augmented Generation (RAG)-assisted zero-shot detection framework that builds a global anomaly knowledge base and distills it into reusable analysis frameworks, reducing reliance on handcrafted prompts. Experiments show that LLMs are particularly effective at identifying contextual anomalies, whereas Graph Neural Network (GNN)-based methods remain stronger on structural anomalies. RAG-assisted prompting also performs comparably to human-designed prompts without manual prompt engineering.

📝 Abstract
Anomaly detection on attributed graphs plays an essential role in applications such as fraud detection, intrusion monitoring, and misinformation analysis. However, text-attributed graphs (TAGs), in which node information is expressed in natural language, remain underexplored, largely due to the absence of standardized benchmark datasets. In this work, we introduce TAG-AD, a comprehensive benchmark for anomaly node detection on TAGs. TAG-AD leverages large language models (LLMs) to generate realistic anomalous node texts directly in the raw text space, producing anomalies that are semantically coherent yet contextually inconsistent and thus more reflective of real-world irregularities. In addition, TAG-AD incorporates multiple other anomaly types, enabling thorough and reproducible evaluation of graph anomaly detection (GAD) methods. With these datasets, we further benchmark existing unsupervised GNN-based GAD methods as well as zero-shot LLMs for GAD. As part of our zero-shot detection setup, we propose a retrieval-augmented generation (RAG)-assisted, LLM-based zero-shot anomaly detection framework. The framework mitigates reliance on brittle, hand-crafted prompts by constructing a global anomaly knowledge base and distilling it into reusable analysis frameworks. Our experimental results reveal a clear division of strengths: LLMs are particularly effective at detecting contextual anomalies, whereas GNN-based methods remain superior for structural anomaly detection. Moreover, RAG-assisted prompting achieves performance comparable to human-designed prompts while eliminating manual prompt engineering, underscoring the practical value of our RAG-assisted zero-shot LLM anomaly detection framework.

Problem

Research questions and friction points this paper is trying to address.

Detecting anomalies in text-attributed graphs with natural language node information
Developing a standardized benchmark for evaluating graph anomaly detection methods on TAGs
Creating a zero-shot LLM framework for anomaly detection that does not depend on manually written prompts
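
To make the notion of a contextual anomaly concrete, the sketch below scores each node by how little its text resembles the texts of its neighbors. This is a deliberately simple bag-of-words baseline written for illustration, not the method proposed in the paper; the toy graph, the function names, and the scoring rule (one minus mean cosine similarity) are all assumptions.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts for a piece of text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def contextual_scores(texts, edges):
    """Score each node as 1 - mean text similarity to its neighbors."""
    neighbors = {node: set() for node in texts}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    vectors = {node: bag_of_words(t) for node, t in texts.items()}
    scores = {}
    for node, nbrs in neighbors.items():
        if not nbrs:
            scores[node] = 0.0  # isolated node: no context to compare against
            continue
        mean_sim = sum(cosine(vectors[node], vectors[n]) for n in nbrs) / len(nbrs)
        scores[node] = 1.0 - mean_sim
    return scores


# Hypothetical toy graph: three graph-learning papers and one off-topic node.
texts = {
    0: "graph neural networks for node classification",
    1: "graph neural networks for link prediction",
    2: "node classification with graph attention networks",
    3: "slow cooker recipes for a weeknight dinner",
}
edges = [(0, 1), (0, 2), (1, 2), (3, 0), (3, 1), (3, 2)]

scores = contextual_scores(texts, edges)
most_anomalous = max(scores, key=scores.get)
print(most_anomalous, round(scores[most_anomalous], 3))
```

Node 3 is fluent text on its own, yet it shares almost no vocabulary with the papers it is linked to, so it receives the highest score. A word-overlap measure like this only catches crude topic shifts, which is one reason LLM-based reasoning is a natural fit for this anomaly type.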

Innovation

Methods, ideas, or system contributions that make the work stand out.

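LLM-generated anomalous node texts, written directly in the raw text space, that are semantically coherent yet contextually inconsistent
RAG-assisted framework mitigates reliance on hand-crafted prompts
Global anomaly knowledge base distilled into reusable analysis frameworks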
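
The general shape of retrieval-assisted prompting can be sketched as follows: look up the most relevant entries in a small knowledge base of anomaly patterns, then assemble them into a prompt together with the node and its neighbors. Everything here is a hypothetical illustration rather than the paper's implementation: the knowledge-base entries, the prompt wording, and the function names are invented, Jaccard token overlap stands in for whatever retriever a real system would use, and no LLM is actually called.

```python
import re


def tokens(text):
    """Set of lowercase word tokens in a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, knowledge_base, k=2):
    """Return the k entries whose descriptions best overlap the query (Jaccard)."""
    query_tokens = tokens(query)

    def jaccard(entry):
        entry_tokens = tokens(entry["description"])
        union = query_tokens | entry_tokens
        return len(query_tokens & entry_tokens) / len(union) if union else 0.0

    return sorted(knowledge_base, key=jaccard, reverse=True)[:k]


def build_prompt(node_text, neighbor_texts, guidance):
    """Assemble a zero-shot prompt from retrieved guidance and node context."""
    lines = [
        "You are checking one node of a text-attributed graph for anomalies.",
        "",
        "Anomaly patterns to consider:",
    ]
    lines += [f"- {g['name']}: {g['description']}" for g in guidance]
    lines += ["", "Target node text:", node_text, "", "Neighbor texts:"]
    lines += [f"- {t}" for t in neighbor_texts]
    lines += ["", "Answer with a score from 0 (normal) to 1 (anomalous)."]
    return "\n".join(lines)


# Hypothetical knowledge base of anomaly patterns (illustrative entries only).
knowledge_base = [
    {"name": "topic mismatch",
     "description": "paper abstract whose topic differs from the papers it cites in a citation network"},
    {"name": "review spam",
     "description": "product review text that promotes an unrelated product in a co-purchase network"},
    {"name": "dense clique",
     "description": "group of nodes with unusually dense links compared with the rest of the network"},
]

query = "citation network of machine learning paper abstracts"
guidance = retrieve(query, knowledge_base, k=2)

node_text = "slow cooker recipes for a weeknight dinner"
neighbor_texts = ["graph neural networks for node classification"]
prompt = build_prompt(node_text, neighbor_texts, guidance)
print(prompt)
```

The point of the design is that the guidance section of the prompt is filled in by retrieval instead of being written by hand for each dataset, so the same template can be reused across graphs.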
Authors

Haoyan Xu, University of Southern California (Machine Learning)
Ruizhi Qian, University of Southern California
Zhengtao Yao, University of Southern California
Ziyi Liu, University of Southern California
Li Li, University of Southern California
Yuqi Li, City College of New York
Yanshu Li, Brown University (NLP, Multimodal Learning)
Wenqing Zheng, The University of Texas at Austin (Machine Learning, Graph Neural Networks, Computer Vision, Symbolic Reasoning)
Daniele Rosa, Capital One
Daniel Barcklow, Capital One
Senthil Kumar, Bell Laboratories (Machine Learning, Computer Vision, Pattern Recognition, Data Analysis, Statistics)
Jieyu Zhao, Assistant Professor at USC (Natural Language Processing, Machine Learning, Fairness in AI)
Yue Zhao, University of Southern California