🤖 AI Summary
This work addresses the lack of standardized benchmarks and realistic anomalous-text generation methods for anomaly detection in Text-Attributed Graphs (TAGs). To this end, we introduce TAG-AD, the first dedicated benchmark, which leverages Large Language Models (LLMs) in a Retrieval-Augmented Generation (RAG) pipeline that automatically synthesizes anomalous texts that are semantically plausible yet contextually inconsistent, covering diverse anomaly types. We further propose a zero-shot anomaly detection framework that decouples global semantic knowledge modeling from local graph structural modeling, eliminating reliance on hand-crafted prompts. Experiments demonstrate that LLMs excel at identifying contextual anomalies, whereas Graph Neural Networks (GNNs) are more effective for structural anomalies. Moreover, RAG-augmented prompting achieves performance comparable to human-designed prompts, substantially enhancing zero-shot generalization and practical applicability.
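The RAG-based anomaly synthesis described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: all function names (`retrieve_context`, `build_anomaly_prompt`, `synthesize_anomalous_text`) and the toy graph format are assumptions, and `llm_generate` stands in for any LLM completion call.

```python
# Hypothetical sketch of RAG-style anomalous-text synthesis on a TAG:
# retrieve neighbor texts as grounding context, then ask an LLM for a
# rewrite that is fluent in isolation but inconsistent with that context.
# Names and data layout are illustrative assumptions, not the paper's API.

def retrieve_context(graph, node, k=3):
    """Retrieve up to k neighbor texts to ground generation (the retrieval step)."""
    neighbors = graph["edges"].get(node, [])[:k]
    return [graph["texts"][n] for n in neighbors]

def build_anomaly_prompt(node_text, context_texts, anomaly_type="contextual"):
    """Prompt for text that is plausible on its own yet clashes with neighbors."""
    context = "\n".join(f"- {t}" for t in context_texts)
    return (
        f"Neighborhood context:\n{context}\n\n"
        f"Original node text:\n{node_text}\n\n"
        f"Rewrite the node text so it remains semantically plausible in isolation "
        f"but is {anomaly_type}ly inconsistent with the neighborhood above."
    )

def synthesize_anomalous_text(graph, node, llm_generate):
    """llm_generate: any callable that maps a prompt string to generated text."""
    prompt = build_anomaly_prompt(graph["texts"][node], retrieve_context(graph, node))
    return llm_generate(prompt)
```

With a real LLM client substituted for `llm_generate`, the same loop can be run over sampled nodes to inject anomalies of different types directly in the raw text space.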
📝 Abstract
Anomaly detection on attributed graphs plays an essential role in applications such as fraud detection, intrusion monitoring, and misinformation analysis. However, text-attributed graphs (TAGs), in which node information is expressed in natural language, remain underexplored, largely due to the absence of standardized benchmark datasets. In this work, we introduce TAG-AD, a comprehensive benchmark for anomalous node detection on TAGs. TAG-AD leverages large language models (LLMs) to generate realistic anomalous node texts directly in the raw text space, producing anomalies that are semantically coherent yet contextually inconsistent and thus more reflective of real-world irregularities. In addition, TAG-AD incorporates multiple other anomaly types, enabling thorough and reproducible evaluation of graph anomaly detection (GAD) methods. With these datasets, we further benchmark existing unsupervised GNN-based GAD methods as well as zero-shot LLMs for GAD.
As part of our zero-shot detection setup, we propose a retrieval-augmented generation (RAG)-assisted, LLM-based zero-shot anomaly detection framework. The framework mitigates reliance on brittle, hand-crafted prompts by constructing a global anomaly knowledge base and distilling it into reusable analysis frameworks. Our experimental results reveal a clear division of strengths: LLMs are particularly effective at detecting contextual anomalies, whereas GNN-based methods remain superior for structural anomaly detection. Moreover, RAG-assisted prompting achieves performance comparable to human-designed prompts while eliminating manual prompt engineering, underscoring the practical value of our RAG-assisted zero-shot LLM anomaly detection framework.
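The decoupling described above, a global semantic signal and a local structural signal scored independently and then combined, can be illustrated with a minimal sketch. This is not the paper's method: `llm_judge` stands in for any LLM-based consistency scorer, and the degree-deviation statistic is a deliberately simple stand-in for a GNN-derived structural score.

```python
# Illustrative sketch of decoupled zero-shot anomaly scoring on a TAG:
# a semantic score (LLM judging text-vs-neighborhood consistency) and a
# structural score (a toy graph statistic standing in for a GNN signal)
# are computed independently and linearly combined. All names are assumptions.

def semantic_score(node_text, neighbor_texts, llm_judge):
    """llm_judge: any callable returning an inconsistency score in [0, 1]."""
    return llm_judge(node_text, neighbor_texts)

def structural_score(graph, node):
    """Toy structural signal: how far the node's degree sits from the mean degree."""
    degrees = {n: len(graph["edges"].get(n, [])) for n in graph["texts"]}
    mean = sum(degrees.values()) / len(degrees)
    spread = max(max(degrees.values()) - mean, 1e-9)
    return min(abs(degrees[node] - mean) / spread, 1.0)  # rough [0, 1] scaling

def anomaly_score(graph, node, llm_judge, alpha=0.5):
    """Combine the two decoupled signals; alpha weights semantic vs structural."""
    neighbors = [graph["texts"][n] for n in graph["edges"].get(node, [])]
    s_sem = semantic_score(graph["texts"][node], neighbors, llm_judge)
    s_str = structural_score(graph, node)
    return alpha * s_sem + (1 - alpha) * s_str
```

Because the two signals are computed separately, the LLM prompt never needs to encode graph structure, which is one way the framework can avoid brittle, hand-crafted prompts.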