🤖 AI Summary
Graph machine learning has long suffered from benchmark fragmentation: datasets are task-specific, evaluation protocols lack standardization, and out-of-distribution (OOD) generalization is rarely considered, which severely hinders reproducibility and cross-model comparison. To address this, we introduce GraphBench, a cross-domain, multi-task graph learning benchmark platform supporting node-, edge-, and graph-level prediction as well as generative tasks. GraphBench features standardized data splits, a unified evaluation protocol, and an automated hyperparameter tuning framework; notably, it integrates OOD generalization metrics directly into its core evaluation suite. We establish principled baselines using message-passing GNNs and graph transformers, conducting systematic evaluations across diverse datasets. GraphBench improves evaluation consistency and result comparability, providing reproducible, scalable, and standardized infrastructure for graph learning research.
📝 Abstract
Machine learning on graphs has recently achieved impressive progress in various domains, including molecular property prediction and chip design. However, benchmarking practices remain fragmented, often relying on narrow, task-specific datasets and inconsistent evaluation protocols, which hampers reproducibility and broader progress. To address this, we introduce GraphBench, a comprehensive benchmarking suite that spans diverse domains and prediction tasks, including node-level, edge-level, graph-level, and generative settings. GraphBench provides standardized evaluation protocols -- with consistent dataset splits and performance metrics that account for out-of-distribution generalization -- as well as a unified hyperparameter tuning framework. In addition, we evaluate message-passing neural networks and graph transformer models on GraphBench, providing principled baselines and establishing reference performance. See www.graphbench.io for further details.
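To make the idea of a unified evaluation protocol concrete, here is a minimal sketch of the pattern the abstract describes: fixed, named dataset splits (including a held-out OOD split) and a single metric applied identically to every model. All names here (`EvalProtocol`, `accuracy`, the toy data) are illustrative assumptions, not GraphBench's actual API.

```python
# Hypothetical sketch of a standardized evaluation protocol: every model
# is scored with the same metric on the same fixed splits, so in-distribution
# and out-of-distribution results are directly comparable across models.
# These names are assumptions for illustration, not GraphBench's real API.
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Stand-in for a graph example: (feature vector, label).
Example = Tuple[List[float], int]

@dataclass
class EvalProtocol:
    # e.g. keys "train", "test_id" (in-distribution), "test_ood"
    splits: Dict[str, List[Example]]
    metric: Callable[[List[int], List[int]], float]

    def evaluate(self, model: Callable[[List[float]], int]) -> Dict[str, float]:
        # Score every held-out split with the same metric; the training
        # split is excluded from reporting.
        return {
            name: self.metric([model(x) for x, _ in data], [y for _, y in data])
            for name, data in self.splits.items()
            if name != "train"
        }

def accuracy(pred: List[int], true: List[int]) -> float:
    return sum(p == t for p, t in zip(pred, true)) / len(true)

# Usage with a toy threshold "model":
protocol = EvalProtocol(
    splits={
        "train": [([0.0], 0)],
        "test_id": [([1.0], 1), ([0.0], 0)],
        "test_ood": [([2.0], 1)],  # shifted feature range
    },
    metric=accuracy,
)
scores = protocol.evaluate(lambda x: int(x[0] >= 0.5))
```

The key design point this sketch illustrates is that the split definitions and the metric live in the protocol, not in each model's own evaluation script, which is what makes scores comparable across submissions.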