🤖 AI Summary
Existing surveys on graph foundation models (GFMs) suffer from outdated coverage, ambiguous taxonomies of self-supervised methods, and an overreliance on architecture-specific perspectives, all of which hinder a systematic understanding of how general graph knowledge is learned. To address these limitations, we propose a knowledge-based, three-tier classification framework (microscopic, mesoscopic, and macroscopic) that spans nine categories of graph knowledge and more than 25 pretext tasks, unifying multi-level representations of nodes, structures, and semantics. We introduce the first knowledge-guided taxonomy for self-supervised GFMs, moving beyond traditional architecture-centric paradigms to accommodate emerging directions such as graph language models. Furthermore, we establish explicit mappings among knowledge types, pretext tasks, and downstream generalization strategies. The resulting framework covers recent state-of-the-art advances and improves model interpretability, downstream generalization, and cross-task reusability.
📝 Abstract
Graph self-supervised learning (SSL) has become the go-to method for pre-training graph foundation models (GFMs). A wide variety of knowledge patterns is embedded in graph data, such as node properties and clusters, and these patterns are crucial for learning generalized representations for GFMs. However, existing surveys of GFMs have several shortcomings: they lack comprehensive coverage of the most recent progress, categorize self-supervised methods unclearly, and take a limited architecture-based perspective restricted to certain types of graph models. Since the ultimate goal of GFMs is to learn generalized graph knowledge, we provide a comprehensive survey of self-supervised GFMs from a novel knowledge-based perspective. We propose a knowledge-based taxonomy that categorizes self-supervised graph models by the specific graph knowledge they utilize. Our taxonomy consists of microscopic knowledge (nodes, links, etc.), mesoscopic knowledge (context, clusters, etc.), and macroscopic knowledge (global structure, manifolds, etc.). In total, it covers 9 knowledge categories and more than 25 pretext tasks for pre-training GFMs, along with various downstream task generalization strategies. This knowledge-based taxonomy allows us to re-examine graph models built on new architectures, such as graph language models, more clearly, and to provide deeper insights for constructing GFMs.
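To make the taxonomy concrete, the sketch below illustrates one microscopic pretext task, masked link prediction, in plain PyTorch: an edge is hidden from the graph, and a small GCN-style encoder is trained so that node-embedding dot products distinguish the held-out edge from a non-edge. This is a minimal illustration under our own assumptions; the encoder (`TinyGCN`), the toy graph, and all hyperparameters are hypothetical and do not correspond to any specific method covered by the survey.

```python
# Hypothetical sketch of a microscopic pretext task (masked link prediction).
# TinyGCN, the toy graph, and all hyperparameters are illustrative assumptions,
# not the survey's reference implementation. Requires only PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGCN(nn.Module):
    """Two-layer GCN-style encoder over a dense normalized adjacency."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, out_dim)

    def forward(self, a_norm, x):
        h = F.relu(self.w1(a_norm @ x))   # propagate neighbors, then transform
        return self.w2(a_norm @ h)        # final node embeddings

def normalize_adj(a):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a = a + torch.eye(a.size(0))
    d_inv_sqrt = a.sum(1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)

# Toy graph: a 6-node cycle with random 8-dim node features.
edges = torch.tensor([[0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [5, 0]])
n, x = 6, torch.randn(6, 8)
a = torch.zeros(n, n)
a[edges[:, 0], edges[:, 1]] = 1.0
a[edges[:, 1], edges[:, 0]] = 1.0

# Pretext task: hide edge (0, 1), then train the encoder so that embedding
# dot products score the held-out edge above a non-adjacent (negative) pair.
masked = a.clone()
masked[0, 1] = masked[1, 0] = 0.0     # the masked positive pair
a_norm = normalize_adj(masked)

enc = TinyGCN(8, 16, 16)
opt = torch.optim.Adam(enc.parameters(), lr=1e-2)
for step in range(100):
    z = enc(a_norm, x)
    pos = (z[0] * z[1]).sum()         # score for the masked edge
    neg = (z[0] * z[3]).sum()         # score for a non-adjacent pair
    loss = F.binary_cross_entropy_with_logits(
        torch.stack([pos, neg]), torch.tensor([1.0, 0.0]))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Mesoscopic and macroscopic pretext tasks follow the same recipe with different reconstruction targets, e.g., cluster assignments or graph-level properties, while the encoder itself stays unchanged.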