🤖 AI Summary
Graph Foundation Models (GFMs) suffer from limited cross-domain structural generalization, particularly on graphs lacking node attributes. Method: This paper proposes a two-stage structure-aware pretraining framework: (1) a self-supervised stage that predicts graph invariants (e.g., cycle count, number of connected components) to explicitly capture topological structure; and (2) a refinement stage that integrates the learned structural representations as positional encodings into graph neural networks, enabling joint modeling of structure and features without requiring node attributes. Contribution/Results: The approach substantially improves abstract structural understanding and cross-domain transferability. Experiments demonstrate consistent superiority over state-of-the-art GFMs across multi-source graph classification benchmarks, with substantial gains in few-shot and attribute-free settings. This work establishes a structure-driven paradigm for graph representation learning.
📝 Abstract
This work focuses on training graph foundation models (GFMs) with strong generalization ability on graph-level tasks such as graph classification. Effective GFM training requires capturing information that is consistent across domains. We find that graph structures provide more consistent cross-domain information than node features and graph labels. However, traditional GFMs primarily focus on transferring node features from various domains into a unified representation space and often lack structural cross-domain generalization. To address this, we introduce GraphProp, which emphasizes structural generalization. The training process of GraphProp consists of two main phases. First, we train a structural GFM by predicting graph invariants. Since graph invariants are properties of graphs that depend only on the abstract structure, not on particular labellings or drawings of the graph, this structural GFM can capture abstract structural information and provide discriminative graph representations that are comparable across diverse domains. In the second phase, we use the representations given by the structural GFM as positional encodings to train a comprehensive GFM. This phase utilizes domain-specific node attributes and graph labels to further improve cross-domain node feature generalization. Our experiments demonstrate that GraphProp significantly outperforms competing methods in supervised learning and few-shot learning, especially on graphs without node attributes.
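To make the first phase concrete, the invariants mentioned above (cycle count, number of connected components) can be computed directly from a graph's edge list and used as label-free regression targets. The sketch below is illustrative only: the function name `invariant_targets` and the exact invariant set are assumptions, not the paper's actual implementation, and GraphProp's structural GFM architecture is not shown.

```python
from collections import deque

def invariant_targets(num_nodes, edges):
    """Compute simple graph invariants as self-supervised targets.

    Hypothetical helper: the paper mentions cycle count and number of
    connected components as examples; its full invariant set may differ.
    """
    # Build an undirected adjacency list.
    adj = {v: [] for v in range(num_nodes)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Count connected components with BFS.
    seen = set()
    components = 0
    for start in range(num_nodes):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)

    # Circuit rank |E| - |V| + C counts independent cycles; it is a
    # graph invariant, unchanged by relabelling the nodes.
    cycle_count = len(edges) - num_nodes + components
    return {"num_components": components, "cycle_count": cycle_count}

# A 5-cycle plus one isolated node: 2 components, 1 independent cycle.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(invariant_targets(6, edges))  # → {'num_components': 2, 'cycle_count': 1}
```

Because these targets depend only on abstract structure, a model trained to predict them receives the same supervision signal regardless of which domain the graph comes from, which is the property the first phase relies on.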