When Do Graph Foundation Models Transfer? A Data-Centric Theory

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the unstable, and sometimes negative, cross-domain transfer commonly observed in Graph Foundation Models (GFMs), whose root cause remains unclear. Taking a data-centric perspective, the study builds a graphon-based theoretical framework for cross-domain GFM transfer. For any Lipschitz backbone, it explicitly decomposes the cross-domain output shift into finite-sample approximation effects and an intrinsic, relabeling-invariant domain discrepancy that captures structural mismatch. The analysis combines a graphon-based continuous limit for dense graphs, Lipschitz analysis of the backbone, and stability guarantees for spectral positional encodings (PEs), contrasting eigenvector-based with subspace-based PEs. Experiments on synthetic and real-world graphs validate the theory and turn the decomposition into actionable guidance for data selection and curation in GFM transfer.
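
For orientation, the decomposition described in this summary can be written schematically as below. This is a sketch under assumed notation, not the paper's exact statement: \Phi stands for the fixed backbone with Lipschitz constant L, G_1 and G_2 are graphs on n_1 and n_2 nodes drawn from graphons W_1 and W_2, the \varepsilon terms stand for the finite-sample approximation errors, and \delta is a relabeling-invariant discrepancy. The cut-distance form of \delta (an infimum over measure-preserving maps \sigma, with W^{\sigma}(x, y) = W(\sigma(x), \sigma(y))) is one standard choice from graphon theory and is an assumption here.

```latex
\[
\|\Phi(G_1) - \Phi(G_2)\|
\;\le\;
L \Big(
  \underbrace{\varepsilon_{n_1}(W_1) + \varepsilon_{n_2}(W_2)}_{\text{finite-sample terms}}
  \;+\;
  \underbrace{\delta(W_1, W_2)}_{\text{domain discrepancy}}
\Big),
\qquad
\delta(W_1, W_2) \;=\; \inf_{\sigma} \big\| W_1 - W_2^{\sigma} \big\|_{\square}
\]
```

Read this way, the first group of terms shrinks as the graphs grow, while the second is a property of the two domains alone and does not depend on how nodes are labeled.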

📝 Abstract

Graph foundation models (GFMs) aim to reuse a single backbone across diverse graph domains, yet their transfer is often uneven and can exhibit negative transfer. While most prior work improves transfer through architectural or adaptation choices, we ask a data-centric question: which properties of two graph domains determine how much a fixed representation model changes its outputs? Using a graphon-based continuous limit for dense graphs, we show that for both set-based and message-passing tokenizations, any Lipschitz backbone admits an explicit decomposition of cross-domain output shift into (i) graph-specific finite-sample approximation terms and (ii) an intrinsic, relabeling-invariant domain discrepancy capturing structural mismatch. A key ingredient is positional-encoding (PE) stability: we establish stability guarantees for spectral PEs and highlight contrasting behaviors of eigenvector- versus subspace-based PEs. Experiments on synthetic and real graphs validate the theory and translate the decomposition into guidance for data curation in GFM transfer.
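
The contrast between eigenvector-based and subspace-based PEs can be illustrated with a small, self-contained sketch. It is not the paper's experiment: the 8-node cycle graph, the rotation angle, and all function names are assumptions chosen for illustration. The cycle has a repeated adjacency eigenvalue, so any rotation of the basis within that eigenspace is an equally valid set of eigenvectors, whereas the projector onto the eigenspace is unchanged.

```python
import math

def cycle_eigvecs(n):
    """Orthonormal cos/sin Fourier modes spanning the eigenspace of the
    n-cycle adjacency matrix for the eigenvalue 2*cos(2*pi/n) (n >= 3)."""
    scale = math.sqrt(2.0 / n)
    u = [scale * math.cos(2 * math.pi * i / n) for i in range(n)]
    v = [scale * math.sin(2 * math.pi * i / n) for i in range(n)]
    return u, v

def rotate(u, v, theta):
    """Rotate the orthonormal pair (u, v) by angle theta within its span."""
    c, s = math.cos(theta), math.sin(theta)
    u_rot = [c * a + s * b for a, b in zip(u, v)]
    v_rot = [-s * a + c * b for a, b in zip(u, v)]
    return u_rot, v_rot

def apply_cycle(x):
    """Multiply by the n-cycle adjacency matrix: (A x)_i = x_{i-1} + x_{i+1}."""
    n = len(x)
    return [x[(i - 1) % n] + x[(i + 1) % n] for i in range(n)]

def projector(u, v):
    """Subspace-style PE: the projector P = u u^T + v v^T as a nested list."""
    n = len(u)
    return [[u[i] * u[j] + v[i] * v[j] for j in range(n)] for i in range(n)]

def max_abs_diff(P, Q):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

n = 8
lam = 2 * math.cos(2 * math.pi / n)          # the repeated eigenvalue
u, v = cycle_eigvecs(n)
u_rot, v_rot = rotate(u, v, math.pi / 4)     # another valid eigenbasis

# The rotated vector is still an eigenvector for the same eigenvalue.
eig_residual = max(abs(a - lam * b) for a, b in zip(apply_cycle(u_rot), u_rot))
# Eigenvector PEs from the two bases disagree node by node ...
eigvec_gap = max(abs(a - b) for a, b in zip(u, u_rot))
# ... while the subspace projector is the same for both bases.
proj_gap = max_abs_diff(projector(u, v), projector(u_rot, v_rot))

print(f"eigen-residual of rotated vector: {eig_residual:.1e}")
print(f"max eigenvector PE difference:    {eigvec_gap:.3f}")
print(f"max projector PE difference:      {proj_gap:.1e}")
```

In this toy case the eigenvector coordinates differ by an amount comparable to the entries themselves, while the projector difference sits at floating-point rounding level. This is the kind of basis ambiguity that makes eigenvector-based PEs harder to control than subspace-based ones when eigenvalues are repeated or nearly repeated.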

Problem

Research questions and friction points this paper is trying to address.

graph foundation models

transfer learning

domain discrepancy

graphon

positional encoding

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph foundation models

domain transfer

graphon