🤖 AI Summary
This work addresses the challenge of out-of-distribution (OOD) detection in textual attributed graphs, where OOD nodes—exhibiting unknown textual or structural patterns—often lead graph neural networks to produce overconfident yet erroneous predictions. Existing approaches struggle to jointly preserve semantic depth and topological validity. To this end, we propose LG-Plug, a plug-and-play strategy leveraging large language models (LLMs) to generate fine-grained embeddings by fusing topological and textual representations. LG-Plug introduces a consensus-driven, clustering-based iterative prompting mechanism that produces OOD exposure signals as regularization terms. It is the first method to co-optimize textual semantics and graph topology for OOD detection, incorporating a lightweight clustering codebook and heuristic sampling to minimize LLM invocation costs. Experiments demonstrate that LG-Plug significantly enhances detection accuracy and robustness, effectively mitigating the imbalance between reliability and informativeness in synthetic OOD priors.
📝 Abstract
Text-attributed graphs (TAGs) associate nodes with textual attributes and graph structure, enabling GNNs to jointly model semantic and structural information. While effective on in-distribution (ID) data, GNNs often encounter out-of-distribution (OOD) nodes with unseen textual or structural patterns in real-world settings, leading to overconfident and erroneous predictions in the absence of reliable OOD detection. Early approaches address this issue from a topology-driven perspective, leveraging neighboring structures to mitigate node-level detection bias. However, these methods typically encode node texts as shallow vector features, failing to fully exploit rich semantic information. In contrast, recent LLM-based approaches generate pseudo OOD priors by leveraging textual knowledge, but they suffer from several limitations: (1) a reliability-informativeness imbalance in the synthesized OOD priors, as the generated OOD exposures either deviate from the true OOD semantics or introduce non-negligible ID noise, both of which offer limited improvements in detection performance; (2) reliance on specialized architectures, which prevents them from incorporating the effective topology-level insights that have been empirically validated in prior work. To this end, we propose LG-Plug, an LLM-Guided Plug-and-play strategy for TAG OOD detection. LG-Plug aligns topology and text representations to produce fine-grained node embeddings, then generates consensus-driven OOD exposure via clustered iterative LLM prompting. Moreover, it leverages a lightweight in-cluster codebook and heuristic sampling to reduce the time cost of LLM querying. The resulting OOD exposure serves as a regularization term that separates ID and OOD nodes, enabling seamless integration with existing detectors.
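To make the "OOD exposure as a regularization term" idea concrete, here is a minimal sketch of one common way such a term can be realized: an energy-margin (outlier-exposure-style) penalty that pushes the energy scores of ID nodes above those of the LLM-synthesized OOD exposures. The function names and the specific hinge formulation are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np

def energy_score(logits):
    # Energy score per node: logsumexp over class logits.
    # Confident ID predictions yield higher scores; OOD-like inputs
    # with flat logits yield lower scores.
    m = logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    return (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).squeeze(1)

def exposure_regularizer(id_logits, ood_logits, margin=1.0):
    # Hinge penalty (illustrative): require the mean ID energy to exceed
    # the mean energy of synthesized OOD exposures by at least `margin`.
    # Added to the detector's base training loss as a regularization term.
    e_id = energy_score(id_logits)
    e_ood = energy_score(ood_logits)
    return max(0.0, margin - (e_id.mean() - e_ood.mean()))
```

In this sketch, `ood_logits` would come from forwarding the consensus-driven OOD exposures through the same classifier as ID nodes; because the term only consumes logits, it can be appended to any existing detector's objective, which is the plug-and-play property the abstract describes.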