🤖 AI Summary
Graph neural networks (GNNs) suffer from insufficient robustness and reliability in open-set scenarios due to unseen classes, yet no comprehensive benchmark exists for evaluating open-set recognition (OSR) on graph-structured data. Method: We introduce the first holistic Graph Open-Set Recognition (GOSR) benchmark, covering both node-level and graph-level tasks across heterogeneous, multi-domain graph datasets. It unifies, for the first time, three previously disjoint tasks—Graph Out-of-Distribution Detection (GOODD), GOSR, and Graph Anomaly Detection (GAD)—under a coherent conceptual framework. Our standardized evaluation framework spans cross-layer feature modeling, multi-granularity discrimination strategies, and unified protocol design. Contribution/Results: We release a reproducible platform with baseline implementations, revealing critical limitations of existing methods in generalization, confidence calibration, and computational efficiency. This work fills a fundamental gap in robustness assessment for graph learning in open-world settings and advances the reliable deployment of GNNs in real-world applications.
📝 Abstract
Graph Neural Networks (GNNs) have achieved significant success in machine learning, with wide applications in social networks, bioinformatics, knowledge graphs, and other fields. Most research assumes an ideal closed-set environment. In real-world open-set environments, however, graph learning models face robustness and reliability challenges because of unseen classes. This highlights the need for Graph Open-Set Recognition (GOSR) methods that address these issues and make GNNs effective in practical scenarios. Research on GOSR is still in its early stages and lacks a comprehensive benchmark spanning diverse tasks and datasets for evaluating methods. Moreover, traditional methods and methods for Graph Out-of-Distribution Detection (GOODD), GOSR, and Graph Anomaly Detection (GAD) have mostly evolved in isolation, with little exploration of their interconnections or their potential applications to GOSR. To fill these gaps, we introduce G-OSR, a comprehensive benchmark for evaluating GOSR methods at both the node and graph levels. It uses datasets from multiple domains to ensure fair and standardized comparisons of effectiveness and efficiency across traditional, GOODD, GOSR, and GAD methods. The results offer critical insights into the generalizability and limitations of current GOSR methods, and the systematic analysis of diverse approaches provides valuable resources for advancing research in this field.