Size Transferability of Graph Transformers with Convolutional Positional Encodings

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Graph Transformers exhibit limited transferability across graphs of varying scales, hindering their application to large-scale graphs. This work addresses this limitation by theoretically establishing, under mild assumptions, the transferability of Graph Transformers from small to large graphs through a manifold limit analysis of graph sequences. The study further uncovers an intrinsic connection between positional encodings in Graph Transformers and graph neural networks (GNNs). The proposed approach integrates GNN-based convolutional positional encoding, manifold limit modeling, and convergence analysis of graph sequences. Empirical validation on standard benchmarks and terrain shortest-path tasks demonstrates that Graph Transformers not only match GNNs in scalability but also achieve efficient and practical transferability in real-world settings.

📝 Abstract
Transformers have achieved remarkable success across domains, motivating the rise of Graph Transformers (GTs) as attention-based architectures for graph-structured data. A key design choice in GTs is the use of Graph Neural Network (GNN)-based positional encodings to incorporate structural information. In this work, we study GTs through the lens of manifold limit models for graph sequences and establish a theoretical connection between GTs with GNN positional encodings and Manifold Neural Networks (MNNs). Building on transferability results for GNNs under manifold convergence, we show that GTs inherit transferability guarantees from their positional encodings. In particular, GTs trained on small graphs provably generalize to larger graphs under mild assumptions. We complement our theory with extensive experiments on standard graph benchmarks, demonstrating that GTs exhibit scalable behavior on par with GNNs. To further demonstrate practical efficiency in a real-world scenario, we apply GTs to shortest-path distance estimation over terrains, illustrating the benefits of transferable GTs. Our results provide new insights into the understanding of GTs and suggest practical directions for efficient training of GTs in large-scale settings.
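The core architectural idea in the abstract, a transformer layer whose structural information comes from a GNN-style convolutional positional encoding, with the same trained weights reused on graphs of different sizes, can be sketched roughly as follows. This is an illustrative toy sketch, not the paper's implementation: the normalization, nonlinearity, and single-head attention here are all simplifying assumptions, and `gnn_positional_encoding` and `W_pe` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

def gnn_positional_encoding(A, X, W):
    """One graph-convolution step used as a positional encoding:
    degree-normalized neighborhood aggregation, then a linear map."""
    deg = A.sum(axis=1, keepdims=True)
    A_norm = A / np.maximum(deg, 1.0)        # row-normalized adjacency
    return np.tanh(A_norm @ X @ W)

def self_attention(H):
    """Plain dense self-attention over node features (single head)."""
    scores = H @ H.T / np.sqrt(H.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ H

def graph_transformer_layer(A, X, W_pe):
    # Inject structure via the convolutional positional encoding,
    # then let attention mix node features globally.
    H = X + gnn_positional_encoding(A, X, W_pe)
    return self_attention(H)

d = 4
W_pe = rng.standard_normal((d, d)) * 0.1     # parameters are size-independent

# The same weights W_pe apply unchanged to a small and a larger random graph,
# which is the mechanism behind the size-transferability claim.
for n in (10, 100):
    A = (rng.random((n, n)) < 0.3).astype(float)
    A = np.maximum(A, A.T)
    np.fill_diagonal(A, 0.0)
    X = rng.standard_normal((n, d))
    out = graph_transformer_layer(A, X, W_pe)
    print(n, out.shape)
```

The point of the sketch is only that every learned parameter (`W_pe` here) has a shape independent of the number of nodes, so a model trained on small graphs can be evaluated on larger ones without modification; the paper's contribution is proving when the outputs actually converge under manifold limits.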
Problem

Research questions and friction points this paper is trying to address.

Graph Transformers
Size Transferability
Positional Encodings
Manifold Neural Networks
Graph Generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Transformers
Positional Encodings
Transferability
Manifold Neural Networks
Size Generalization