🤖 AI Summary
This work proposes FraudCoT, a novel framework addressing key challenges in fraud detection on text-attributed graphs: the difficulty of jointly integrating semantic and structural information, the limited reasoning autonomy of LLM-enhanced graph neural networks, and decoupled training dynamics. FraudCoT introduces a graph-aware chain-of-thought reasoning mechanism and combines fraud-aware selective chain-of-thought distillation with an asymmetric co-training strategy. Together, these components strengthen semantic-structural alignment and substantially improve training efficiency. Experiments show that FraudCoT achieves up to an 8.8% relative improvement in AUPRC over state-of-the-art methods on both public and industrial datasets, while accelerating training throughput by up to 1,066×.
📝 Abstract
Graph-based fraud detection on text-attributed graphs (TAGs) requires jointly modeling rich textual semantics and relational dependencies. However, existing LLM-enhanced GNN approaches are constrained by predefined prompting and decoupled training pipelines, which limit reasoning autonomy and weaken semantic-structural alignment. We propose FraudCoT, a unified framework that advances TAG-based fraud detection through autonomous, graph-aware chain-of-thought (CoT) reasoning and scalable LLM-GNN co-training. To address the limitations of predefined prompts, we introduce a fraud-aware selective CoT distillation mechanism that generates diverse reasoning paths and enhances semantic-structural understanding. These distilled CoTs are integrated into node texts, providing GNNs with enriched, multi-hop semantic and structural cues for fraud detection. Furthermore, we develop an efficient asymmetric co-training strategy that enables end-to-end optimization while significantly reducing the computational cost of naive joint training. Extensive experiments on public and industrial benchmarks demonstrate that FraudCoT achieves up to an 8.8% relative AUPRC improvement over state-of-the-art methods and delivers up to a 1,066× speedup in training throughput, substantially advancing both detection performance and efficiency.
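To make the data flow concrete, here is a minimal, hypothetical sketch of the CoT-augmentation step the abstract describes: a distilled reasoning path is appended to each node's raw text before it reaches the GNN's text encoder. All names (`distill_cot`, `augment_node_text`, the `[CoT]` separator) are illustrative assumptions, not from the paper; the real system would invoke an LLM rather than the template stand-in used here.

```python
# Hypothetical illustration of FraudCoT-style CoT augmentation of node texts.
# In the actual framework, an LLM performs fraud-aware selective CoT
# distillation over a node and its graph neighborhood; here a simple template
# stands in for that call so the text-enrichment pipeline is visible.

def distill_cot(node_text: str, neighbor_texts: list[str]) -> str:
    """Stand-in for selective CoT distillation: summarize the node and its
    neighbors into a single reasoning string (an LLM call in practice)."""
    neighbor_summary = "; ".join(neighbor_texts)
    return (f"Reasoning: the node states '{node_text}' while its "
            f"{len(neighbor_texts)} neighbors indicate: {neighbor_summary}.")

def augment_node_text(node_text: str, neighbor_texts: list[str]) -> str:
    """Append the distilled CoT to the raw node text, producing the enriched
    input that the GNN's text encoder would consume."""
    return node_text + " [CoT] " + distill_cot(node_text, neighbor_texts)

enriched = augment_node_text(
    "Seller offers luxury watches at 90% discount",
    ["Account created yesterday", "Shares IP with 40 banned accounts"],
)
print(enriched)
```

The key point is that the multi-hop structural signal (neighbor evidence) is serialized into the node's textual attribute, so a standard text-encoding GNN can exploit it without architectural changes.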