🤖 AI Summary
Graph learning holds significant promise for applications such as drug design, yet existing benchmarks suffer from structural deficiencies: they overemphasize 2D molecular graphs while neglecting high-impact domains such as combinatorial optimization and chip design. In addition, benchmark datasets often abstract the underlying data poorly, and evaluation is fragmented and accuracy-centric, which encourages overfitting and yields poor generalization. Method: This work systematically identifies the root causes of benchmark failure and proposes a paradigm shift in benchmark reconstruction, grounded in three core principles: real-world impact, cross-domain generalizability, and domain synergy. We introduce a critical analytical framework, define meta-criteria for benchmark evaluation, and advocate interdisciplinary collaboration. Contribution/Results: The study establishes a new, trustworthy, scalable, and reproducible evaluation methodology for graph foundation models, advancing the field toward problem-driven scientific development.
📝 Abstract
While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research and unlock its potential.