GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks

📅 2023-10-05

🏛️ Trans. Mach. Learn. Res.

📈 Citations: 6

✨ Influential: 0

🤖 AI Summary

Deep Graph Neural Networks (GNNs) incur high memory costs because their receptive field grows exponentially with depth. Existing sampling methods reduce this cost but mostly rely on static heuristics that may not generalize across graphs or tasks. This paper proposes GRAPES, an adaptive sampling method in which a second GNN predicts node sampling probabilities and is trained by optimizing the downstream task objective. GRAPES is evaluated on node classification benchmarks covering both homophilous and heterophilous graphs, and the authors report that it is particularly effective on multi-label heterophilous graphs. Unlike other sampling methods, it maintains high accuracy with smaller sample sizes, which allows it to scale to massive graphs.

📝 Abstract

Graph neural networks (GNNs) learn to represent nodes by aggregating information from their neighbors. As GNNs increase in depth, their receptive field grows exponentially, leading to high memory costs. Several existing methods address this by sampling a small subset of nodes, scaling GNNs to much larger graphs. These methods are primarily evaluated on homophilous graphs, where neighboring nodes often share the same label. However, most of these methods rely on static heuristics that may not generalize across different graphs or tasks. We argue that the sampling method should be adaptive, adjusting to the complex structural properties of each graph. To this end, we introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN. GRAPES trains a second GNN to predict node sampling probabilities by optimizing the downstream task objective. We evaluate GRAPES on various node classification benchmarks, involving homophilous as well as heterophilous graphs. We demonstrate GRAPES' effectiveness in accuracy and scalability, particularly in multi-label heterophilous graphs. Unlike other sampling methods, GRAPES maintains high accuracy even with smaller sample sizes and, therefore, can scale to massive graphs. Our code is publicly available at https://github.com/dfdazac/grapes.
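
The receptive-field growth described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy random graph, the function names, and the uniform per-layer budget, which stands in for any sampler. It is not the GRAPES implementation; the abstract only states that GRAPES learns the sampling probabilities rather than fixing them.

```python
import random


def make_random_graph(num_nodes, avg_degree, seed=0):
    """Build an undirected random graph as an adjacency dict (toy data)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(num_nodes)}
    edges_left = num_nodes * avg_degree // 2
    while edges_left > 0:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges_left -= 1
    return adj


def receptive_field_sizes(adj, target, num_layers, per_layer_budget=None, seed=0):
    """Count the nodes needed to compute `target` after each GNN layer.

    With per_layer_budget=None every neighbor is kept (no sampling).
    Otherwise at most `per_layer_budget` new nodes are added per layer,
    chosen uniformly at random here (a placeholder for a learned sampler).
    """
    rng = random.Random(seed)
    visited = {target}
    sizes = [1]
    for _ in range(num_layers):
        # Candidates are all unvisited neighbors of the nodes kept so far.
        candidates = set()
        for u in visited:
            candidates |= adj[u]
        candidates -= visited
        if per_layer_budget is not None and len(candidates) > per_layer_budget:
            candidates = set(rng.sample(sorted(candidates), per_layer_budget))
        visited |= candidates
        sizes.append(len(visited))
    return sizes
```

Calling `receptive_field_sizes(adj, 0, 3)` and then the same call with `per_layer_budget=8` on a graph from `make_random_graph(200, 6)` shows the contrast: without a budget the node count can multiply with every layer, while with a budget it grows by at most 8 nodes per layer.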

Problem

Research questions and friction points this paper is trying to address.

Addressing high memory costs in deep GNNs via adaptive sampling

Overcoming limitations of static heuristics in graph sampling methods

Improving scalability and accuracy for heterophilous graph node classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive sampling method that learns which nodes are crucial for training a GNN

Uses a second GNN to predict node sampling probabilities

Trains the sampler by optimizing the downstream task objective
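
The idea of a sampler trained through the downstream objective can be sketched in a few lines. This is a toy under loud assumptions, not the authors' method: the "sampler GNN" is a single mean-aggregation step with one scalar weight, `placeholder_task_loss` is a hypothetical stand-in for the downstream GNN loss, and the score-function (REINFORCE-style) gradient is one common way to train through discrete sampling that this summary does not confirm GRAPES uses. All function names are illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def two_ring_graph(ring_size=10):
    """Two disconnected rings: a tiny homophilous toy graph."""
    adj = {}
    for offset in (0, ring_size):
        for i in range(ring_size):
            left = offset + (i - 1) % ring_size
            right = offset + (i + 1) % ring_size
            adj[offset + i] = {left, right}
    return adj


def mean_aggregate(adj, features):
    """One mean-aggregation step over each node and its neighbors."""
    return {
        v: (features[v] + sum(features[u] for u in adj[v])) / (1 + len(adj[v]))
        for v in adj
    }


def placeholder_task_loss(sampled, useful, num_nodes):
    """Hypothetical stand-in for the downstream loss: lower when the
    sample contains useful nodes and avoids the others."""
    good = sum(1 for v in sampled if v in useful)
    bad = len(sampled) - good
    return (bad - good) / num_nodes


def train_sampler(adj, features, useful, steps=200, lr=0.5, seed=0):
    """Train the toy sampler with a score-function gradient (no baseline)."""
    rng = random.Random(seed)
    hidden = mean_aggregate(adj, features)
    w = 0.0  # single learnable weight of the toy sampler
    for _ in range(steps):
        probs = {v: sigmoid(w * hidden[v]) for v in adj}
        # Independent Bernoulli inclusion decision for every node.
        sampled = {v for v in adj if rng.random() < probs[v]}
        loss = placeholder_task_loss(sampled, useful, len(adj))
        # Gradient of log P(sampled) with respect to w.
        grad_log_prob = sum(((v in sampled) - probs[v]) * hidden[v] for v in adj)
        w -= lr * loss * grad_log_prob
    return w, {v: sigmoid(w * hidden[v]) for v in adj}
```

With `features` set to `+1.0` on one ring and `-1.0` on the other, and the first ring marked as `useful`, the learned weight becomes positive, so nodes on the useful ring receive higher sampling probabilities than the rest. A real system would replace the placeholder loss with the classification loss of the downstream GNN computed on the sampled subgraph.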
