A Survey on Self-Supervised Graph Foundation Models: Knowledge-Based Perspective

📅 2024-03-24
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing surveys on graph foundation models (GFMs) suffer from outdated coverage, ambiguous taxonomies of self-supervised methods, and an overreliance on architecture-specific perspectives, which hinders a systematic understanding of general graph knowledge learning. To address these limitations, we propose a knowledge-based, three-tiered classification framework (microscopic, mesoscopic, macroscopic) encompassing nine categories of graph knowledge and more than 25 pretraining tasks, unifying multi-level representations of nodes, structures, and semantics. We introduce the first knowledge-guided taxonomy for self-supervised GFMs, shifting away from traditional architecture-centric paradigms to accommodate emerging directions such as graph language models. Furthermore, we establish explicit mappings among knowledge types, pretraining tasks, and downstream generalization strategies. This framework covers state-of-the-art advances and improves model interpretability, downstream generalization capability, and cross-task reusability.

📝 Abstract
Graph self-supervised learning (SSL) is now a go-to method for pre-training graph foundation models (GFMs). There is a wide variety of knowledge patterns embedded in the graph data, such as node properties and clusters, which are crucial to learning generalized representations for GFMs. However, existing surveys of GFMs have several shortcomings: they lack comprehensiveness regarding the most recent progress, have unclear categorization of self-supervised methods, and take a limited architecture-based perspective that is restricted to only certain types of graph models. As the ultimate goal of GFMs is to learn generalized graph knowledge, we provide a comprehensive survey of self-supervised GFMs from a novel knowledge-based perspective. We propose a knowledge-based taxonomy, which categorizes self-supervised graph models by the specific graph knowledge utilized. Our taxonomy consists of microscopic (nodes, links, etc.), mesoscopic (context, clusters, etc.), and macroscopic knowledge (global structure, manifolds, etc.). It covers a total of 9 knowledge categories and more than 25 pretext tasks for pre-training GFMs, as well as various downstream task generalization strategies. Such a knowledge-based taxonomy allows us to re-examine graph models based on new architectures more clearly, such as graph language models, as well as provide more in-depth insights for constructing GFMs.
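To make the idea of a pretext task concrete, below is a minimal sketch of a microscopic (link-level) pretext objective of the kind the taxonomy covers: node embeddings from a toy one-layer neighbor aggregation are trained to score observed edges above random node pairs. The function names, the mean-aggregation layer, and the logistic loss are illustrative assumptions, not code from the survey.

```python
# Toy link-prediction pretext task (microscopic knowledge) -- an
# illustrative sketch, not the survey's implementation.
import numpy as np

def propagate(adj, feats):
    """One round of mean-neighbor aggregation (a toy GNN layer)."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    return (adj @ feats) / deg

def link_pretext_loss(adj, feats, rng):
    """Score observed edges against random (negative) node pairs.

    Returns a logistic loss; lower means the embeddings separate real
    edges from random pairs, i.e. the link-prediction pretext signal.
    """
    z = propagate(adj, feats)                       # node embeddings
    pos = np.argwhere(np.triu(adj) > 0)             # observed edges
    neg = rng.integers(0, adj.shape[0], pos.shape)  # random pairs

    def score(pairs):
        return np.sum(z[pairs[:, 0]] * z[pairs[:, 1]], axis=1)

    s_pos, s_neg = score(pos), score(neg)
    # logistic loss: push edge scores up, random-pair scores down
    return float(np.mean(np.log1p(np.exp(-s_pos))) +
                 np.mean(np.log1p(np.exp(s_neg))))

# Toy 4-node path graph with 2-d node features
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]])
loss = link_pretext_loss(adj, feats, np.random.default_rng(0))
print(round(loss, 3))
```

In a real GFM, the embeddings would come from a trainable encoder and this loss would be minimized during pre-training; mesoscopic and macroscopic tasks (e.g. cluster or global-structure prediction) replace the edge-scoring objective with higher-level targets.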
Problem

Research questions and friction points this paper is trying to address.

Survey self-supervised graph models from knowledge perspective
Classify graph SSL methods by knowledge patterns used
Propose taxonomy covering 9 knowledge categories for GFMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes knowledge-based taxonomy for graph models
Categorizes graph knowledge into microscopic, mesoscopic, macroscopic
Covers 9 knowledge categories and more than 25 pretext tasks
Zi-qiang Zhao
School of Computer Science and Technology, Huazhong University of Science and Technology
Yixin Su
Huazhong University of Science and Technology
Large Language Models, Personalization, Recommender systems, Graph Neural Networks
Yuhua Li
School of Computer Science and Technology, Huazhong University of Science and Technology
Yixiong Zou
Huazhong University of Science and Technology
Computer vision, Domain generalization, Few-shot learning, Vision-language model
Ruixuan Li
School of Computer Science and Technology, Huazhong University of Science and Technology
Rui Zhang
www.ruizhang.info