Large Language Models as Topological Structure Enhancers for Text-Attributed Graphs

📅 2023-11-24
🏛️ arXiv.org
📈 Citations: 14
Influential: 0
🤖 AI Summary
This work addresses the challenges of high topological noise and insufficient semantic information in text-attributed graph (TAG) node classification, proposing an LLM-driven, end-to-end graph structure optimization framework. Methodologically, it comprises: (1) a prompt-engineered edge pruning and completion mechanism that leverages semantic similarity to reconstruct semantically coherent connectivity; and (2) high-quality pseudo-label generation via LLMs to jointly guide GNN learning of edge weights and node representations, augmented by a pseudo-label propagation regularization. This is among the first systematic investigations into leveraging LLMs for graph topology enhancement, moving beyond their conventional role as feature extractors. Evaluated on four real-world datasets, the method achieves consistent improvements in classification accuracy, ranging from 0.15% to 2.47%, demonstrating both the efficacy of semantics-driven graph refinement and strong cross-domain generalizability.
📝 Abstract
The latest advancements in large language models (LLMs) have revolutionized the field of natural language processing (NLP). Inspired by the success of LLMs in NLP tasks, some recent work has begun investigating the potential of applying LLMs to graph learning tasks. However, most existing work focuses on utilizing LLMs as powerful node feature augmenters, leaving the use of LLMs to enhance graph topological structures an understudied problem. In this work, we explore how to leverage the information retrieval and text generation capabilities of LLMs to refine and enhance the topological structure of text-attributed graphs (TAGs) under the node classification setting. First, we propose using LLMs to help remove unreliable edges and add reliable ones in the TAG: through careful prompt designs, we have the LLM output the semantic similarity between node attributes, and then perform edge deletion and edge addition based on that similarity. Second, we propose using pseudo-labels generated by the LLM to improve the graph topology; that is, we introduce pseudo-label propagation as a regularization that guides the graph neural network (GNN) in learning proper edge weights. Finally, we incorporate the two LLM-based methods for graph topological refinement into the GNN training process and perform extensive experiments on four real-world datasets. The results demonstrate the effectiveness of LLM-based graph topology refinement (achieving a 0.15%--2.47% performance gain on public benchmarks).
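The first mechanism in the abstract (query an LLM for pairwise semantic similarity between node attributes, then delete weak edges and add strong ones) can be sketched as follows. This is an illustrative sketch only: `llm_similarity` is a token-overlap stand-in for the paper's actual prompted LLM query, and the `delete_below`/`add_above` thresholds are hypothetical, not the paper's settings.

```python
def llm_similarity(text_a, text_b):
    # Stand-in for the paper's prompt-based LLM similarity query:
    # a cheap Jaccard token overlap between the two node attributes.
    ta, tb = set(text_a.lower().split()), set(text_b.lower().split())
    return len(ta & tb) / len(ta | tb)

def refine_edges(texts, edges, delete_below=0.2, add_above=0.6):
    """Drop edges whose endpoint texts look dissimilar, then add
    edges between unconnected pairs that look highly similar."""
    # Edge deletion: keep only edges whose endpoints clear the floor.
    kept = [(u, v) for (u, v) in edges
            if llm_similarity(texts[u], texts[v]) >= delete_below]
    existing = {frozenset(e) for e in kept}
    # Edge addition: connect highly similar, currently unlinked pairs.
    n = len(texts)
    added = [(u, v) for u in range(n) for v in range(u + 1, n)
             if frozenset((u, v)) not in existing
             and llm_similarity(texts[u], texts[v]) >= add_above]
    return kept + added
```

In the paper, the similarity comes from the LLM itself via prompt design; the scan over all unlinked pairs shown here would also need a candidate-restriction step to scale to large graphs.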
Problem

Research questions and friction points this paper is trying to address.

Enhancing graph topology using LLMs
Refining edges in text-attributed graphs
Improving GNN training with LLM-based methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs enhance graph topology
Edge modification via semantic similarity
Pseudo-labels guide GNN training
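The third point (LLM pseudo-labels guiding GNN training via propagation regularization) can be sketched, under assumptions, as a propagated pseudo-label target added to the classification loss. The propagation scheme, the `alpha`/`lam` values, and the squared-error agreement term below are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def propagate_pseudo_labels(adj, pseudo, steps=2, alpha=0.5):
    # Row-normalized propagation of LLM pseudo-label one-hot vectors
    # over the graph, with a teleport back to the original labels.
    deg = adj.sum(1, keepdims=True).clip(min=1.0)
    P = adj / deg
    y = pseudo.copy()
    for _ in range(steps):
        y = alpha * (P @ y) + (1 - alpha) * pseudo
    return y

def regularized_loss(logits, labels_onehot, adj, pseudo, lam=0.1):
    # Supervised cross-entropy plus agreement with propagated
    # pseudo-labels, weighted by a hypothetical coefficient `lam`.
    probs = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
    ce = -(labels_onehot * np.log(probs + 1e-9)).sum(1).mean()
    target = propagate_pseudo_labels(adj, pseudo)
    reg = ((probs - target) ** 2).sum(1).mean()
    return ce + lam * reg
```

In the paper this regularizer shapes learnable edge weights during GNN training; here `adj` is fixed purely to keep the sketch self-contained.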
Shengyin Sun
City University of Hong Kong

Yuxiang Ren
Tenure-track Assistant Professor, Nanjing University
Graph Neural Network · AI for Science · Foundation Model

Chen Ma
City University of Hong Kong

Xuecang Zhang
Advance Computing and Storage Lab, Huawei Technologies