Graph neural networks extrapolate out-of-distribution for shortest paths

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the limited out-of-distribution (OOD) generalization of graph neural networks (GNNs) on large-scale graphs for shortest-path computation. The authors propose an algorithm-aligned, scalable learning framework that guides GNNs to implicitly implement the Bellman–Ford algorithm via a dynamic-programming-inspired architecture and a sparsity-regularized loss function. Crucially, they provide the first rigorous theoretical proof that, under sparse regularization, GNNs exactly or approximately reproduce Bellman–Ford’s iterative update mechanism. This guarantee enables zero-shot cross-scale extrapolation: models trained solely on small graphs generalize robustly to arbitrarily large, unseen graphs. Empirical results demonstrate substantial OOD generalization gains on large-scale graphs and validate the framework’s theoretical interpretability and algorithmic fidelity.

📝 Abstract
Neural networks (NNs), despite their success and wide adoption, still struggle to extrapolate out-of-distribution (OOD), i.e., to inputs that are not well-represented by their training dataset. Addressing the OOD generalization gap is crucial when models are deployed in environments significantly different from the training set, such as applying Graph Neural Networks (GNNs) trained on small graphs to large, real-world graphs. One promising approach for achieving robust OOD generalization is the framework of neural algorithmic alignment, which incorporates ideas from classical algorithms by designing neural architectures that resemble specific algorithmic paradigms (e.g., dynamic programming). The hope is that trained models of this form would have superior OOD capabilities, in much the same way that classical algorithms work for all instances. We rigorously analyze the role of algorithmic alignment in achieving OOD generalization, focusing on graph neural networks (GNNs) applied to the canonical shortest path problem. We prove that GNNs, trained to minimize a sparsity-regularized loss over a small set of shortest path instances, exactly implement the Bellman-Ford (BF) algorithm for shortest paths. In fact, if a GNN minimizes this loss within an error of $\epsilon$, it implements the BF algorithm with an error of $O(\epsilon)$. Consequently, despite limited training data, these GNNs are guaranteed to extrapolate to arbitrary shortest-path problems, including instances of any size. Our empirical results support our theory by showing that NNs trained by gradient descent are able to minimize this loss and extrapolate in practice.
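For reference, the Bellman-Ford update that the paper's GNNs are trained to reproduce can be sketched as a min-aggregation "message passing" step. This is a minimal illustration of the classical algorithm, not the paper's code; the graph representation and function names are assumptions.

```python
def bellman_ford_step(dist, edges):
    """One synchronous Bellman-Ford round: each node v takes the min over
    incoming messages dist[u] + w(u, v) -- the same min-aggregation that
    an algorithmically aligned GNN layer must learn to implement."""
    new_dist = dict(dist)
    for u, v, w in edges:
        new_dist[v] = min(new_dist[v], dist[u] + w)
    return new_dist

def shortest_paths(n, edges, source):
    """Run n-1 rounds of updates on a graph with n nodes; because each
    round only relaxes distances, the result is exact for any graph size,
    which is the extrapolation property the paper proves GNNs inherit."""
    dist = {v: float("inf") for v in range(n)}
    dist[source] = 0.0
    for _ in range(n - 1):
        dist = bellman_ford_step(dist, edges)
    return dist
```

For example, with `edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)]`, `shortest_paths(3, edges, 0)` returns distance 3.0 for node 2, taking the two-hop path through node 1 rather than the direct 5.0 edge.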
Problem

Research questions and friction points this paper is trying to address.

Why do graph neural networks struggle to generalize out-of-distribution?
Does neural algorithmic alignment improve OOD generalization in GNNs?
Do GNNs trained on shortest-path instances actually implement the Bellman-Ford algorithm?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proof that trained GNNs exactly (or approximately, with $O(\epsilon)$ error) implement the Bellman-Ford algorithm
Sparsity-regularized loss that guarantees OOD generalization
Zero-shot extrapolation from small training graphs to arbitrarily large graphs
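The sparsity-regularized loss can be sketched in its general form: a supervised task loss plus a penalty that drives weights toward sparse, Bellman-Ford-like solutions. The L1 penalty and the `lam` coefficient here are illustrative assumptions, not the paper's exact regularizer.

```python
def sparsity_regularized_loss(pred, target, params, lam=1e-3):
    """Illustrative sketch only: mean-squared error on predicted
    shortest-path distances plus an L1 sparsity penalty on the weights.
    (The exact loss and coefficient are assumptions; see the paper.)"""
    # Supervised task loss on predicted distances.
    task_loss = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    # L1 penalty: drives most weights to zero, so the surviving weights
    # are pushed toward the sparse min/+ structure of Bellman-Ford updates.
    l1_penalty = sum(abs(w) for layer in params for w in layer)
    return task_loss + lam * l1_penalty
```

The intuition from the paper's theory is that minimizing this kind of loss within $\epsilon$ forces the network's update rule to match Bellman-Ford within $O(\epsilon)$, which is what licenses extrapolation to graphs of any size.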