When Do Graph Foundation Models Transfer? A Data-Centric Theory

📅 2026-05-28
🤖 AI Summary
This work addresses the unstable, and sometimes negative, cross-domain transfer commonly observed in graph foundation models (GFMs), whose root cause remains unclear. Taking a data-centric perspective, it develops a graphon-based theoretical framework for cross-domain GFM transfer. The framework decomposes the cross-domain output shift of a Lipschitz backbone into graph-specific finite-sample approximation effects and a relabeling-invariant domain discrepancy that captures structural mismatch. The analysis rests on a graphon-based continuous limit for dense graphs, Lipschitz analysis of the backbone, and stability guarantees for spectral positional encodings. Experiments on synthetic and real-world graphs, including a comparison of eigenvector-based and subspace-based positional encodings, support the theory. The results offer guidance for data selection and curation in GFM transfer.
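One way to read this decomposition is as a triangle inequality through the graphon limit. The block below is a schematic sketch, not the paper's stated theorem. The symbols are assumptions introduced here for illustration: $\Phi$ is the fixed model (applied either to a finite graph or to its graphon limit), $G_i$ is a graph sampled from graphon $W_i$, $L$ is a Lipschitz constant, and $\delta$ is a cut-distance-style discrepancy defined as an infimum over measure-preserving relabelings $\varphi$.

```latex
\[
\|\Phi(G_1) - \Phi(G_2)\|
\;\le\;
\underbrace{\|\Phi(G_1) - \Phi(W_1)\|}_{\text{finite-sample term}}
+ \underbrace{\|\Phi(W_1) - \Phi(W_2)\|}_{\text{structural mismatch}}
+ \underbrace{\|\Phi(W_2) - \Phi(G_2)\|}_{\text{finite-sample term}}
\]
\[
\delta(W_1, W_2) \;=\; \inf_{\varphi} \,\|W_1 - W_2^{\varphi}\|,
\qquad
W^{\varphi}(x, y) \;=\; W(\varphi(x), \varphi(y))
\]
```

Under the additional assumptions that $\Phi$ is $L$-Lipschitz in the chosen graphon norm and that its output is unchanged by relabeling, the middle term is at most $L\,\delta(W_1, W_2)$, which is what makes the discrepancy intrinsic to the two domains rather than to a particular node ordering.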
📝 Abstract
Graph foundation models (GFMs) aim to reuse a single backbone across diverse graph domains, yet their transfer is often uneven and can exhibit negative transfer. While most prior work improves transfer through architectural or adaptation choices, we ask a data-centric question: which properties of two graph domains determine how much a fixed representation model changes its outputs? Using a graphon-based continuous limit for dense graphs, we show that for both set-based and message-passing tokenizations, any Lipschitz backbone admits an explicit decomposition of cross-domain output shift into (i) graph-specific finite-sample approximation terms and (ii) an intrinsic, relabeling-invariant domain discrepancy capturing structural mismatch. A key ingredient is positional-encoding (PE) stability: we establish stability guarantees for spectral PEs and highlight contrasting behaviors of eigenvector- versus subspace-based PEs. Experiments on synthetic and real graphs validate the theory and translate the decomposition into guidance for data curation in GFM transfer.
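The contrast between eigenvector-based and subspace-based PEs can be illustrated with a small numerical sketch. This is not the paper's experimental setup: the two-block random graph, the choice of the top `k = 2` adjacency eigenvectors, and the helper names (`sample_sbm`, `eigenvector_pe`, `subspace_pe`) are all assumptions made for illustration. The sketch shows that an eigenvector basis is only defined up to sign, while the projector onto the spanned subspace is not affected by that choice and transforms consistently under node relabeling.

```python
import numpy as np


def sample_sbm(n, p_in, p_out, rng):
    """Sample a symmetric two-block random graph (hypothetical toy data)."""
    labels = np.arange(n) % 2
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # no self-loops
    return (upper | upper.T).astype(float)


def eigenvector_pe(adj, k):
    """Eigenvector-based PE: the k eigenvectors with largest eigenvalues."""
    _, vecs = np.linalg.eigh(adj)  # eigenvalues returned in ascending order
    return vecs[:, -k:]


def subspace_pe(adj, k):
    """Subspace-based PE: orthogonal projector onto the same eigenspace."""
    u = eigenvector_pe(adj, k)
    return u @ u.T


rng = np.random.default_rng(0)
n, k = 40, 2
adj = sample_sbm(n, p_in=0.8, p_out=0.1, rng=rng)

# Flipping the sign of one eigenvector gives an equally valid basis.
u = eigenvector_pe(adj, k)
u_flipped = u * np.array([1.0, -1.0])
eig_gap = np.linalg.norm(u - u_flipped)
proj_gap = np.linalg.norm(u @ u.T - u_flipped @ u_flipped.T)

# Relabeling nodes permutes rows and columns of the projector consistently.
perm = rng.permutation(n)
adj_perm = adj[np.ix_(perm, perm)]
proj = subspace_pe(adj, k)
relabel_gap = np.linalg.norm(subspace_pe(adj_perm, k) - proj[np.ix_(perm, perm)])

print("eigenvector PE change under sign flip:", eig_gap)
print("subspace PE change under sign flip:  ", proj_gap)
print("subspace PE mismatch after relabeling:", relabel_gap)
```

The projector is only well defined when the `k`-th and `(k+1)`-th eigenvalues are separated, which is why the toy graph uses two clearly distinct blocks. The cost of the subspace form is an `n × n` matrix instead of an `n × k` one.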
Problem

Research questions and friction points this paper is trying to address.

graph foundation models
transfer learning
domain discrepancy
graphon
positional encoding
Innovation

Methods, ideas, or system contributions that make the work stand out.

graph foundation models
domain transfer
graphon
positional encoding stability
structural discrepancy