🤖 AI Summary
The lack of large-scale, realistic, and well-annotated text-attributed graph benchmarks hinders progress in LLM-based graph outlier detection, particularly fake news detection. Method: We introduce TAGFN, a large-scale, real-world text-attributed graph dataset tailored for graph-based outlier detection, integrating real-world social propagation structures with fine-grained news content annotations, including veracity labels, semantic attributes, and diffusion paths. We propose a unified text-attributed graph modeling framework that enables end-to-end alignment of structural topology and semantic features. Contribution/Results: We publicly release the dataset, benchmark code, and pre-trained model interfaces. TAGFN supports joint evaluation of both traditional graph neural networks and LLM-enhanced graph models, filling a critical gap in high-quality, graph-centric fake news detection benchmarks. This work aims to advance trustworthy AI and foster LLM-driven graph outlier detection research.
📝 Abstract
Large Language Models (LLMs) have recently revolutionized machine learning on text-attributed graphs, but the application of LLMs to graph outlier detection, particularly in the context of fake news detection, remains significantly underexplored. One of the key challenges is the scarcity of large-scale, realistic, and well-annotated datasets that can serve as reliable benchmarks for outlier detection. To bridge this gap, we introduce TAGFN, a large-scale, real-world text-attributed graph dataset for outlier detection, specifically fake news detection. TAGFN enables rigorous evaluation of both traditional and LLM-based graph outlier detection methods. Furthermore, it facilitates the development of misinformation detection capabilities in LLMs through fine-tuning. We anticipate that TAGFN will be a valuable resource for the community, fostering progress in robust graph-based outlier detection and trustworthy AI. The dataset is publicly available at https://huggingface.co/datasets/kayzliu/TAGFN and our code is available at https://github.com/kayzliu/tagfn.