Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers

📅 2024-06-27
🏛️ Neural Information Processing Systems
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing tokenized graph Transformers construct token sequences solely from highly similar nodes, leading to insufficient exploitation of graph structural diversity. To address this, we propose GCFormer: (1) a hybrid token generator that simultaneously constructs positive and negative token sequences; (2) a customized Transformer backbone, integrated with contrastive learning for the first time in tokenized graph Transformer frameworks to enable discriminative cross-sequence representation learning; and (3) a node-level positive/negative sampling strategy to enhance local structural awareness. Evaluated on both homophilic and heterophilic graph benchmarks, GCFormer achieves average accuracy gains of 3.2–5.8% over state-of-the-art GNNs and graph Transformers on node classification tasks. These improvements demonstrate substantially enhanced utilization of graph information and superior node representation capability.

📝 Abstract
While tokenized graph Transformers have demonstrated strong performance in node classification tasks, their reliance on a limited subset of nodes with high similarity scores for constructing token sequences overlooks valuable information from other nodes, hindering their ability to fully harness graph information for learning optimal node representations. To address this limitation, we propose a novel graph Transformer called GCFormer. Unlike previous approaches, GCFormer develops a hybrid token generator to create two types of token sequences, positive and negative, to capture diverse graph information, and adopts a tailored Transformer-based backbone to learn meaningful node representations from these generated token sequences. Additionally, GCFormer introduces contrastive learning to extract valuable information from both positive and negative token sequences, enhancing the quality of learned node representations. Extensive experimental results across various datasets, including homophilic and heterophilic graphs, demonstrate the superiority of GCFormer in node classification when compared to representative graph neural networks (GNNs) and graph Transformers.
Problem

Research questions and friction points this paper is trying to address.

Improving node representation learning in tokenized graph Transformers
Addressing overlooked information from nodes with low similarity
Enhancing node classification performance across diverse graph types
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid token generator creates positive and negative sequences
Transformer backbone learns from diverse token sequences
Contrastive learning enhances node representation quality
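The contrastive objective in the last bullet can be sketched as an InfoNCE-style loss that pulls a node's representation toward embeddings derived from its positive token sequences and pushes it away from those derived from negative sequences. The paper's exact loss is not reproduced here; the function name, cosine similarity, and temperature value below are illustrative assumptions.

```python
import numpy as np

def contrastive_loss(anchor, positives, negatives, tau=0.5):
    """InfoNCE-style loss for one node (illustrative sketch, not GCFormer's exact loss).

    anchor:    (d,)   representation of the target node
    positives: (P, d) representations from positive token sequences
    negatives: (N, d) representations from negative token sequences
    tau:       temperature controlling the sharpness of the softmax
    """
    def cos(a, b):
        # cosine similarity with a small epsilon for numerical stability
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    pos = np.exp([cos(anchor, p) / tau for p in positives])
    neg = np.exp([cos(anchor, n) / tau for n in negatives])
    # Loss is small when positives dominate the similarity mass,
    # large when negatives do.
    return float(-np.log(pos.sum() / (pos.sum() + neg.sum())))
```

For example, an anchor aligned with its positives and orthogonal to its negatives yields a low loss, while the reversed configuration yields a high one, which is the gradient signal that sharpens the node representations.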
Jinsong Chen
Central China Normal University
Graph Representation Learning · Graph Data Mining · AI for Education
Hanpeng Liu
School of Computer Science and Technology, Huazhong University of Science and Technology; Hopcroft Center on Computing Science, Huazhong University of Science and Technology
J. Hopcroft
Department of Computer Science, Cornell University
Kun He
School of Computer Science and Technology, Huazhong University of Science and Technology; Hopcroft Center on Computing Science, Huazhong University of Science and Technology