Invariant-Based Diagnostics for Graph Benchmarks

📅 2026-05-07

🤖 AI Summary

Existing graph benchmarks make it hard to tell whether model performance stems from node features or from graph structure, which obscures how much relational information a model actually uses. This work proposes graph invariants, i.e., task-agnostic, permutation-invariant structural descriptors, as a systematic diagnostic tool, and uses them to build non-trainable structural proxy models. Across 26 datasets, simple invariant-based models are competitive with, and sometimes exceed, transformer and message-passing baselines, and on tasks where structure matters the non-trainable proxy often matches trained message-passing models. The results suggest that expressivity is not the main driver of predictive performance; the invariants also characterize structural heterogeneity within and across datasets and predict multi-task performance.

📝 Abstract

Progress on graph foundation models is hindered by benchmark practices that conflate the contributions of node features and graph structure, making it hard to tell whether a model actually learns from connectivity, or whether it even needs to. We propose addressing this using graph invariants, i.e., permutation-invariant, task-agnostic structural descriptors that serve as a diagnostic framework for graph benchmarks. We show that (i) invariants are more expressive than standard GNNs, (ii) invariants characterize structural heterogeneity within and across benchmark datasets, (iii) invariants predict multi-task performance, and (iv) simple invariant-based models are competitive with, and sometimes exceed, transformer and message-passing baselines across 26 datasets. Our results suggest that expressivity is not the main driver of predictive performance, and that on tasks where structure matters, a non-trainable structural proxy often matches trained message-passing models. We thus posit that invariant baselines should become a standard for evaluating whether structure is required for a task and whether a model picks up on it, serving as a stepping stone towards graph foundation models.
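To make "permutation-invariant, task-agnostic structural descriptor" concrete, the following is a minimal sketch in plain Python. The specific invariants chosen here (node count, edge count, sorted degree sequence, triangle count, number of connected components) and the helper names `invariants` and `relabel` are illustrative assumptions, not the descriptor set or code used in the paper. The sketch only shows the defining property: relabeling the nodes leaves the descriptor unchanged.

```python
import random
from itertools import combinations


def invariants(n, edges):
    """Return a tuple of simple structural descriptors of an undirected graph.

    Illustrative choice of invariants (an assumption, not the paper's set):
    (num_nodes, num_edges, sorted degree sequence, triangles, components).
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        if u != v:  # ignore self-loops in this sketch
            adj[u].add(v)
            adj[v].add(u)

    num_edges = sum(len(nb) for nb in adj.values()) // 2
    degree_seq = tuple(sorted(len(nb) for nb in adj.values()))

    # Count each triangle once by requiring u < v < w.
    triangles = 0
    for u in range(n):
        higher = sorted(x for x in adj[u] if x > u)
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                triangles += 1

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in range(n):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)

    return (n, num_edges, degree_seq, triangles, components)


def relabel(edges, perm):
    """Apply a node permutation (perm[old] = new) to an edge list."""
    return [(perm[u], perm[v]) for u, v in edges]


# Toy graph: a triangle 0-1-2 with a pendant node 3, plus a separate edge 4-5.
n = 6
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (4, 5)]

perm = list(range(n))
random.Random(0).shuffle(perm)

original = invariants(n, edges)
permuted = invariants(n, relabel(edges, perm))
print(original)
print(original == permuted)
```

Because the descriptor depends only on structure, it can be computed once per graph without any training and then passed to a simple downstream predictor, which is the role the abstract assigns to a non-trainable structural proxy.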

Problem

Research questions and friction points this paper is trying to address.

graph benchmarks

graph structure

node features

structural diagnostics

graph foundation models

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph invariants

graph benchmarks

structural expressivity