🤖 AI Summary
Subgraph counting suffers from the absence of standardized evaluation frameworks, fragmented datasets, and scarce ground-truth benchmarks—hindering fair comparison between algorithmic (AL) and machine learning (ML) approaches. To address this, we introduce BEACON, the first comprehensive benchmark comprising six real-world and synthetic graphs and twelve subgraph patterns, with standardized datasets, verifiable ground-truth counts, an integrated evaluation environment, and a public leaderboard. Our systematic analysis reveals a fundamental trade-off: AL methods scale to massive graphs but degrade with pattern complexity, whereas ML methods generalize to complex patterns yet require large-scale labeled data and underperform on small, dense graphs. By unifying classical graph algorithms (e.g., enumeration pruning, path compression) with GNNs and regression models, we establish a cross-paradigm evaluation protocol and a multidimensional metric suite. This clarifies performance boundaries across method classes and advances community-wide consensus on rigorous, reproducible evaluation standards.
📝 Abstract
Subgraph counting, the task of determining the number of instances of a query pattern within a large graph, lies at the heart of many critical applications, from analyzing financial networks and transportation systems to understanding biological interactions. Despite decades of work yielding efficient algorithmic (AL) solutions and, more recently, machine learning (ML) approaches, a clear comparative understanding remains elusive. This gap stems from the absence of a unified evaluation framework, standardized datasets, and accessible ground truths, all of which hinder systematic analysis and fair benchmarking. To overcome these barriers, we introduce BEACON: a comprehensive benchmark designed to rigorously evaluate both AL and ML subgraph counting methods. BEACON provides standardized datasets with verified ground truths, an integrated evaluation environment, and a public leaderboard, enabling reproducible and transparent comparisons across diverse approaches. Our extensive experiments reveal that while AL methods excel at efficiently counting subgraphs in very large graphs, they struggle with complex patterns (e.g., those exceeding six nodes). In contrast, ML methods can handle larger patterns but demand massive amounts of graph data and often yield suboptimal accuracy on small, dense graphs. These insights not only highlight the unique strengths and limitations of each approach but also pave the way for future advances in subgraph counting techniques. Overall, BEACON represents a significant step towards unifying and accelerating research in subgraph counting, encouraging innovative solutions and fostering a deeper understanding of the trade-offs between the algorithmic and machine learning paradigms.
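To make the task concrete, here is a minimal, hypothetical sketch of subgraph counting by brute-force enumeration: counting triangles (3-node cliques) in a tiny undirected graph. The toy graph and helper function are illustrative only; BEACON and the AL/ML methods it evaluates operate on far larger graphs and more general patterns than this naive approach could handle.

```python
from itertools import combinations

# Hypothetical toy graph: the complete graph K4 on nodes 1..4,
# stored as a set of undirected edges.
edges = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}

def connected(u, v):
    """Check whether an undirected edge exists between u and v."""
    return (u, v) in edges or (v, u) in edges

nodes = {u for e in edges for u in e}

# Brute-force pattern count: enumerate every 3-node subset and
# test whether it induces the query pattern (a triangle).
triangles = sum(
    1
    for a, b, c in combinations(sorted(nodes), 3)
    if connected(a, b) and connected(b, c) and connected(a, c)
)
print(triangles)  # K4 contains 4 triangles
```

This brute force is exponential in the pattern size, which is exactly why practical AL methods rely on techniques such as enumeration pruning and why ML methods approximate counts instead of enumerating.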