GraphBench: Next-generation graph learning benchmarking

📅 2025-12-04
🤖 AI Summary
Graph machine learning has long suffered from benchmark fragmentation: datasets are task-specific, evaluation protocols lack standardization, and out-of-distribution (OOD) generalization is rarely considered—severely hindering reproducibility and cross-model comparison. To address this, we introduce GraphBench, the first cross-domain, multi-task graph learning benchmark platform, supporting node-, edge-, and graph-level classification as well as generative tasks. GraphBench features standardized data splits, a unified evaluation protocol, an automated hyperparameter tuning framework, and—uniquely—integrates OOD generalization metrics into its core evaluation suite. We establish authoritative baselines using message-passing GNNs and graph Transformers, conducting systematic evaluations across 12 diverse datasets. GraphBench significantly improves evaluation consistency and result comparability, providing a reproducible, scalable, and standardized infrastructure for graph learning research.

📝 Abstract
Machine learning on graphs has recently achieved impressive progress in various domains, including molecular property prediction and chip design. However, benchmarking practices remain fragmented, often relying on narrow, task-specific datasets and inconsistent evaluation protocols, which hampers reproducibility and broader progress. To address this, we introduce GraphBench, a comprehensive benchmarking suite that spans diverse domains and prediction tasks, including node-level, edge-level, graph-level, and generative settings. GraphBench provides standardized evaluation protocols -- with consistent dataset splits and performance metrics that account for out-of-distribution generalization -- as well as a unified hyperparameter tuning framework. Additionally, we benchmark GraphBench using message-passing neural networks and graph transformer models, providing principled baselines and establishing a reference performance. See www.graphbench.io for further details.
Problem

Research questions and friction points this paper is trying to address.

Standardizes graph learning evaluation across diverse tasks and domains
Addresses fragmented benchmarking with consistent protocols and metrics
Establishes baselines for reproducibility and generalization in graph models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Comprehensive benchmarking suite for diverse graph tasks
Standardized evaluation protocols with consistent dataset splits
Unified hyperparameter tuning framework for graph models
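
To make "consistent dataset splits" that "account for out-of-distribution generalization" concrete, here is a minimal sketch of one common way to build such a split: train and validate on smaller graphs, then test on the largest ones. This is an illustration only. The summary does not say how GraphBench constructs its splits, and `GraphRecord`, `size_based_ood_split`, and every parameter below are hypothetical names, not part of the GraphBench API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphRecord:
    """Hypothetical stand-in for a dataset entry: an id plus its node count."""
    graph_id: int
    num_nodes: int


def size_based_ood_split(graphs, in_dist_frac=0.8, valid_frac=0.1, seed=0):
    """Split graphs so the test set contains only the largest graphs.

    Assumption: graph size is the distribution shift of interest. Other
    shifts (time, scaffold, domain) would need a different sort key.
    """
    # Sort by size, breaking ties by id so the result is fully deterministic.
    ordered = sorted(graphs, key=lambda g: (g.num_nodes, g.graph_id))
    cut = int(len(ordered) * in_dist_frac)
    in_dist, ood_test = ordered[:cut], ordered[cut:]

    # A fixed seed makes the train/validation partition reproducible.
    rng = random.Random(seed)
    shuffled = list(in_dist)
    rng.shuffle(shuffled)
    n_valid = max(1, int(len(shuffled) * valid_frac))

    return {
        "train": shuffled[n_valid:],
        "valid": shuffled[:n_valid],
        "ood_test": ood_test,
    }
```

Because the sort key and the seed fully determine the result, two runs with the same arguments produce identical splits, which is the property a shared benchmark split needs.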
Timo Stoll
RWTH Aachen University
Chendi Qian
RWTH Aachen University
Graph machine learning
Ben Finkelshtein
PhD student, University of Oxford
Geometric Deep Learning, Graph Representation Learning, Graph Neural Networks, Deep Learning
Ali Parviz
Mila
Theoretical Computer Science, Geometric Deep Learning, Graph Neural Networks
Darius Weber
RWTH Aachen University
Fabrizio Frasca
Postdoctoral Fellow, Technion – Israel Institute of Technology
machine learning, graph representation learning, geometric deep learning, artificial intelligence
Hadar Shavit
RWTH Aachen University
Antoine Siraudin
RWTH Aachen University
Arman Mielke
ETAS Research, University of Stuttgart
Marie Anastacio
RWTH Aachen University
Erik Müller
RWTH Aachen University
Maya Bechler-Speicher
Research Scientist, Meta | PhD CS@Tel-Aviv University
Machine Learning, Graph Machine Learning, Graph Neural Networks
Michael Bronstein
DeepMind Professor of AI, University of Oxford / Scientific Director, AITHYRA
geometric deep learning, graph representation learning, graph neural networks, protein design
Mikhail Galkin
Research Scientist, Google
Graph Machine Learning, Knowledge Graphs, Deep Learning, Geometric Deep Learning
Holger Hoos
RWTH Aachen University, Germany • Leiden University, Netherlands • University of British Columbia
Artificial Intelligence, Machine Learning, Automated Reasoning, Empirical Algorithmics, Automated Algorithm Design
Mathias Niepert
University of Stuttgart & NEC Labs Europe
Machine learning
Bryan Perozzi
Google Research
Graph Neural Networks, Machine Learning, Data Mining
Jan Tönshoff
Microsoft Research
Christopher Morris
RWTH Aachen University
Machine learning on graphs, graph neural networks, machine learning for discrete algorithms