GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs

📅 2024-10-14
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address weak cross-domain zero-/few-shot transferability and heavy reliance on label information in Text-Attributed Graphs (TAGs), this paper proposes GraphCLIP, a self-supervised graph-summary contrastive pretraining framework. The method introduces three key innovations: (1) a novel graph-summary contrastive learning paradigm that leverages large language models to generate high-quality graph-summary pairs for unsupervised pretraining; (2) invariant representation learning to enhance cross-domain zero-shot generalization; and (3) a graph prompt-tuning strategy aligned with the pretraining objective to mitigate catastrophic forgetting. Extensive experiments demonstrate average performance gains of 12.6% (zero-shot) and 9.3% (few-shot) across multiple downstream tasks, significantly outperforming state-of-the-art methods. These results validate the method's strong cross-domain transferability and broad task applicability.
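The contrastive pretraining described above pairs a graph encoder with LLM-generated textual summaries, CLIP-style. A minimal sketch of such a symmetric InfoNCE objective over a batch of (graph, summary) embedding pairs is shown below; all names are illustrative and the paper's actual encoders, temperature, and loss details may differ:

```python
import numpy as np

def graph_summary_contrastive_loss(graph_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched (graph, summary) pairs.

    graph_emb: [B, D] embeddings from a graph encoder (hypothetical)
    text_emb:  [B, D] embeddings of LLM-generated summaries (hypothetical)
    """
    # L2-normalize both modalities so the dot product is cosine similarity.
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = g @ t.T / temperature  # [B, B]; matched pairs sit on the diagonal

    def xent(l):
        # Cross-entropy with the diagonal as the target class,
        # contrasting each pair against the in-batch negatives.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # Average the graph-to-summary and summary-to-graph directions.
    return (xent(logits) + xent(logits.T)) / 2
```

With perfectly aligned pairs the loss approaches zero, while mismatched pairs are penalized; in practice both encoders are trained jointly to minimize this objective.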

📝 Abstract
Recently, research on Text-Attributed Graphs (TAGs) has gained significant attention due to the prevalence of free-text node features in real-world applications and the advancements in Large Language Models (LLMs) that bolster TAG methodologies. However, current TAG approaches face two primary challenges: (i) Heavy reliance on label information and (ii) Limited cross-domain zero/few-shot transferability. These issues constrain the scaling of both data and model size, owing to high labor costs and scaling laws, complicating the development of graph foundation models with strong transferability. In this work, we propose the GraphCLIP framework to address these challenges by learning graph foundation models with strong cross-domain zero/few-shot transferability through a self-supervised contrastive graph-summary pretraining method. Specifically, we generate and curate large-scale graph-summary pair data with the assistance of LLMs, and introduce a novel graph-summary pretraining method, combined with invariant learning, to enhance graph foundation models with strong cross-domain zero-shot transferability. For few-shot learning, we propose a novel graph prompt tuning technique aligned with our pretraining objective to mitigate catastrophic forgetting and minimize learning costs. Extensive experiments show the superiority of GraphCLIP in both zero-shot and few-shot settings, while evaluations across various downstream tasks confirm the versatility of GraphCLIP. Our code is available at: https://github.com/ZhuYun97/GraphCLIP
Problem

Research questions and friction points this paper is trying to address.

Enhancing cross-domain transferability in graph models
Reducing reliance on label information for TAGs
Improving zero/few-shot learning in graph foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised contrastive graph-summary pretraining
Large-scale graph-summary pair data generation
Graph prompt tuning for few-shot learning
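The last innovation, graph prompt tuning, adapts the pretrained model to few-shot tasks by keeping the encoder frozen and learning only a small prompt added to the input, which limits both catastrophic forgetting and learning cost. A minimal sketch under those assumptions (the class, layer, and pooling choices here are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

class PromptedGraphEncoder:
    """Few-shot adaptation sketch: pretrained weights stay frozen; only a
    small learnable prompt vector, added to every node feature, is tuned."""

    def __init__(self, dim):
        # Stand-in for frozen pretrained weights (never updated downstream).
        self.W = rng.normal(size=(dim, dim)) / np.sqrt(dim)
        # The prompt is the only trainable parameter during few-shot tuning.
        self.prompt = np.zeros(dim)

    def encode(self, node_feats, adj):
        h = node_feats + self.prompt       # inject prompt into every node
        h = np.tanh(adj @ h @ self.W)      # one frozen message-passing layer
        return h.mean(axis=0)              # mean-pooled graph embedding
```

Because only the prompt receives gradients, the pretrained graph-summary alignment is preserved while the handful of labeled examples steers the representation.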