Toward a universal foundation model for graph-structured data

📅 2026-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited cross-domain generalization of existing graph neural networks, which heavily rely on domain-specific node features and thus underperform in settings with substantial structural heterogeneity, such as biomedical networks. To overcome this, the authors propose a universal graph foundation model that leverages feature-agnostic structural prompts—including degree distribution, centrality, community structure, and diffusion signatures—to guide a message-passing backbone. This enables the embedding of diverse graphs into a unified representation space, supporting “pretrain once, reuse everywhere” transferability. Remarkably, the model generates transferable graph representations without requiring node identities or original features, matching or surpassing strong supervised baselines across multiple benchmarks. On SagePPI, fine-tuning yields a ROC-AUC of 95.5%, outperforming the best supervised method by 21.8% and substantially improving zero-shot and few-shot generalization.
📝 Abstract
Graphs are a central representation in biomedical research, capturing molecular interaction networks, gene regulatory circuits, cell-cell communication maps, and knowledge graphs. Despite their importance, graph analysis currently lacks a broadly reusable foundation model comparable to those that have transformed language and vision. Existing graph neural networks are typically trained on a single dataset and learn representations specific to that graph's node features, topology, and label space, limiting their ability to transfer across domains. This lack of generalization is particularly problematic in biology and medicine, where networks vary substantially across cohorts, assays, and institutions. Here we introduce a graph foundation model designed to learn transferable structural representations that are not tied to particular node identities or feature schemes. Our approach leverages feature-agnostic graph properties, including degree statistics, centrality measures, community structure indicators, and diffusion-based signatures, and encodes them as structural prompts. These prompts are integrated with a message-passing backbone to embed diverse graphs into a shared representation space. The model is pretrained once on heterogeneous graphs and subsequently reused on unseen datasets with minimal adaptation. Across multiple benchmarks, our pretrained model matches or exceeds strong supervised baselines while demonstrating superior zero-shot and few-shot generalization on held-out graphs. On the SagePPI benchmark, supervised fine-tuning of the pretrained backbone achieves a mean ROC-AUC of 95.5%, a gain of 21.8% over the best supervised message-passing baseline. The proposed technique thus offers a practical path toward reusable, foundation-scale models for graph-structured data in biomedical and network science applications.
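The abstract's key ingredient is that the structural prompts are computed from topology alone, with no node identities or original features. A minimal sketch of that idea, in plain Python, might look as follows. This is an illustration, not the paper's implementation: the function name `structural_prompts`, the choice of descriptors (degree, degree centrality, local clustering as a crude community indicator, and a lazy-random-walk mass as a diffusion signature), and the parameters `diffusion_steps` and `alpha` are all assumptions.

```python
# Illustrative sketch (NOT the authors' code): feature-agnostic
# "structural prompts" computed per node from graph topology alone.

def structural_prompts(adj, diffusion_steps=3, alpha=0.5):
    """adj: dict mapping node -> set of neighbors (undirected graph).
    Returns dict node -> [degree, degree centrality, local clustering
    coefficient, diffusion signature]; no node features are used."""
    n = len(adj)
    # Diffusion signature: lazy random-walk diffusion of uniform mass.
    heat = {v: 1.0 / n for v in adj}
    for _ in range(diffusion_steps):
        heat = {v: (1 - alpha) * heat[v]
                   + alpha * sum(heat[u] / len(adj[u]) for u in adj[v])
                for v in adj}
    prompts = {}
    for v, nbrs in adj.items():
        deg = len(nbrs)
        # Local clustering coefficient: fraction of neighbor pairs
        # that are themselves connected (a community-structure cue).
        links = sum(1 for u in nbrs for w in nbrs if u < w and w in adj[u])
        clustering = 2.0 * links / (deg * (deg - 1)) if deg > 1 else 0.0
        prompts[v] = [deg, deg / (n - 1), clustering, heat[v]]
    return prompts

# Toy graph: triangle {0,1,2} plus a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
prompts = structural_prompts(adj)
```

Because every entry of the prompt vector is defined purely in terms of edges, the same function applies unchanged to any graph, which is what makes a "pretrain once, reuse everywhere" setup conceivable; the paper's actual prompts may use richer centrality and diffusion descriptors.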
Problem

Research questions and friction points this paper is trying to address.

graph foundation model
transfer learning
graph neural networks
generalization
biomedical graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

graph foundation model
structural prompts
feature-agnostic representation
transferable graph learning
zero-shot generalization
Sakib Mostafa
Postdoctoral Fellow at Stanford University
Deep Learning, Genomics, Computer Vision
Lei Xing
Stanford University
Md. Tauhidul Islam
Department of Radiation Oncology, Stanford University, Stanford, CA, USA