🤖 AI Summary
This work proposes a universal graph foundation model designed to encode arbitrary graphs into vector representations that preserve both structural and semantic information, thereby supporting graph-level tasks and enabling cross-domain generalization. The approach integrates a multi-graph feature alignment mechanism with a density maximization mean alignment algorithm to enhance the consistency of node embeddings across datasets. Discriminative graph representations are learned through graph neural networks combined with contrastive learning, while a novel pooling-free, multi-layer reference distribution module efficiently aggregates node-level information into graph-level representations. Theoretical analysis provides an upper bound on the generalization error. Extensive experiments show that the model significantly outperforms strong baselines on few-shot graph classification and clustering, validating its representational capacity and generalization ability.
📝 Abstract
This paper aims to train a graph foundation model that represents any graph as a vector preserving the structural and semantic information needed for downstream graph-level tasks such as graph classification and graph clustering. To learn the features of graphs from diverse domains while maintaining strong generalization to new domains, we propose a multi-graph-based feature alignment method, which constructs weighted graphs from the attributes of all nodes in each dataset and then generates consistent node embeddings. To further enhance the consistency of features across datasets, we propose a density maximization mean alignment algorithm with guaranteed convergence. The original graphs and the generated node embeddings are fed into a graph neural network, which learns discriminative graph representations via contrastive learning. More importantly, to preserve information when aggregating node-level representations into a graph-level representation, we construct a multi-layer reference distribution module that requires no pooling operation. We also provide a theoretical generalization bound supporting the effectiveness of the proposed model. Experimental results on few-shot graph classification and graph clustering show that our model outperforms strong baselines.
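The abstract does not spell out how a weighted graph is built from node attributes or how consistent node embeddings are derived from it. As an illustration only, the sketch below shows one common construction under assumed choices (cosine-similarity k-NN weighting and a spectral embedding of the normalized Laplacian); the paper's actual alignment method may differ.

```python
import numpy as np

def knn_weighted_graph(X, k=4):
    """Build a symmetric weighted graph over rows of X (node attributes),
    keeping the k strongest cosine similarities per node. Assumed
    construction, not necessarily the paper's."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T                       # pairwise cosine similarity
    np.fill_diagonal(S, 0.0)            # no self-loops
    W = np.zeros_like(S)
    for i in range(len(S)):
        idx = np.argsort(S[i])[-k:]     # k nearest neighbors of node i
        W[i, idx] = np.clip(S[i, idx], 0.0, None)
    return np.maximum(W, W.T)           # symmetrize

def spectral_embedding(W, dim=2):
    """Embed nodes via the bottom nontrivial eigenvectors of the
    symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)         # eigenvalues ascending
    return vecs[:, 1:dim + 1]           # skip the trivial eigenvector

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))            # 20 nodes, 8 attribute dims
W = knn_weighted_graph(X, k=4)
Z = spectral_embedding(W, dim=2)
print(Z.shape)                          # (20, 2)
```

Embeddings like `Z` depend only on attribute similarities, not on dataset-specific feature dimensions, which is the kind of cross-dataset consistency the alignment step targets; the paper's density maximization mean alignment would then further align such embeddings across datasets.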