An Effective Approach for Node Classification in Textual Graphs

📅 2025-08-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing text-attributed graph (TAG) node classification methods suffer from weak domain-term modeling, difficulty capturing long-range dependencies, poor adaptability to temporal evolution, and limited scalability. To address these challenges, this paper proposes TAPE-Graphormer: (1) leveraging large language models (e.g., ChatGPT) to generate domain-aware semantic explanations for nodes, thereby enriching textual representations; (2) introducing path-aware positional encoding to explicitly model multi-hop structural dependencies in graphs; and (3) jointly fusing semantic and topological features via attention mechanisms. Evaluated on ogbn-arxiv, TAPE-Graphormer achieves 77.2% accuracy—substantially outperforming GCN baselines—and attains 67.1% precision, 57.7% recall, and 61.0% F1-score. These results demonstrate the effectiveness and generalizability of synergistic semantic guidance and structure-aware modeling for TAG node classification.

Technology Category

Application Category

📝 Abstract
Textual Attribute Graphs (TAGs) are critical for modeling complex networks like citation networks, but effective node classification remains challenging due to difficulties in integrating rich semantics from text with structural graph information. Existing methods often struggle with capturing nuanced domain-specific terminology, modeling long-range dependencies, adapting to temporal evolution, and scaling to massive datasets. To address these issues, we propose a novel framework that integrates TAPE (Text-Attributed Graph Representation Enhancement) with Graphormer. Our approach leverages a large language model (LLM), specifically ChatGPT, within the TAPE framework to generate semantically rich explanations from paper content, which are then fused into enhanced node representations. These embeddings are combined with structural features using a novel integration layer with learned attention weights. Graphormer's path-aware position encoding and multi-head attention mechanisms are employed to effectively capture long-range dependencies across the citation network. We demonstrate the efficacy of our framework on the challenging ogbn-arxiv dataset, achieving state-of-the-art performance with a classification accuracy of 0.772, significantly surpassing the best GCN baseline of 0.713. Our method also yields strong results in precision (0.671), recall (0.577), and F1-score (0.610). We validate our approach through comprehensive ablation studies that quantify the contribution of each component, demonstrating the synergy between semantic and structural information. Our framework provides a scalable and robust solution for node classification in dynamic TAGs, offering a promising direction for future research in knowledge systems and scientific discovery.
Problem

Research questions and friction points this paper is trying to address.

Integrating text semantics with graph structure for node classification
Capturing domain-specific terminology and long-range dependencies
Scaling node classification for dynamic and massive datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates TAPE with Graphormer for enhanced node representation
Uses ChatGPT to generate semantically rich text explanations
Employs path-aware encoding to capture long-range dependencies
🔎 Similar Papers
No similar papers found.