GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks

📅 2023-10-05
🏛️ Trans. Mach. Learn. Res.
📈 Citations: 6
Influential: 0
🤖 AI Summary
Deep Graph Neural Networks (GNNs) run into memory and scalability bottlenecks because their receptive field grows exponentially with depth. This paper proposes GRAPES, an adaptive graph sampling method whose sampler is learned rather than fixed. Unlike existing methods that rely on static heuristics, GRAPES trains a second GNN to predict node sampling probabilities by optimizing the downstream task objective, so the sampling distribution adapts to the graph and task at hand. The method is evaluated on node classification benchmarks covering both homophilous and heterophilous graphs. The authors report gains in accuracy and scalability, particularly on multi-label heterophilous graphs, and find that GRAPES keeps high accuracy even with smaller sample sizes, which lets it scale to massive graphs.
📝 Abstract
Graph neural networks (GNNs) learn to represent nodes by aggregating information from their neighbors. As GNNs increase in depth, their receptive field grows exponentially, leading to high memory costs. Several existing methods address this by sampling a small subset of nodes, scaling GNNs to much larger graphs. These methods are primarily evaluated on homophilous graphs, where neighboring nodes often share the same label. However, most of these methods rely on static heuristics that may not generalize across different graphs or tasks. We argue that the sampling method should be adaptive, adjusting to the complex structural properties of each graph. To this end, we introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN. GRAPES trains a second GNN to predict node sampling probabilities by optimizing the downstream task objective. We evaluate GRAPES on various node classification benchmarks, involving homophilous as well as heterophilous graphs. We demonstrate GRAPES' effectiveness in accuracy and scalability, particularly in multi-label heterophilous graphs. Unlike other sampling methods, GRAPES maintains high accuracy even with smaller sample sizes and, therefore, can scale to massive graphs. Our code is publicly available at https://github.com/dfdazac/grapes.
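The abstract describes homophilous graphs as those where neighboring nodes often share the same label. One common way to quantify this is the edge homophily ratio, the fraction of edges whose endpoints have equal labels. The sketch below is only an illustration: the abstract does not say which homophily measure the paper uses, this single-label version does not cover the multi-label case, and the function name `edge_homophily` and the toy graph are hypothetical.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (illustrative only): nodes 0-2 have label "a", nodes 3-4 have label "b".
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (2, 3)]

# Four of the five edges connect same-label nodes; only (2, 3) crosses labels.
ratio = edge_homophily(edges, labels)
print(ratio)  # 0.8
```

A ratio near 1 indicates a homophilous graph. A low ratio indicates a heterophilous one, where neighbors tend to carry different labels.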
Problem

Research questions and friction points this paper is trying to address.

Addressing high memory costs in deep GNNs via adaptive sampling
Overcoming limitations of static heuristics in graph sampling methods
Improving scalability and accuracy for heterophilous graph node classification
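The memory problem listed above can be made concrete with a small counting exercise. The sketch below counts how many distinct nodes fall inside the receptive field of one target node as layers are added, and compares that with a fixed per-layer sampling budget. The complete binary tree, the budget `k`, and the bound `1 + k * L` (at most `k` new nodes kept per layer) are assumptions chosen for illustration, not details from the paper.

```python
def receptive_field_sizes(adj, root, num_layers):
    """Count distinct nodes within 0..num_layers hops of `root`."""
    seen = {root}
    frontier = {root}
    sizes = [1]
    for _ in range(num_layers):
        # Expand one hop and keep only nodes not reached before.
        frontier = {v for u in frontier for v in adj[u]} - seen
        seen |= frontier
        sizes.append(len(seen))
    return sizes


def complete_binary_tree(depth):
    """Adjacency lists of a complete binary tree; node 0 is the root."""
    n = 2 ** (depth + 1) - 1
    adj = {i: [] for i in range(n)}
    for child in range(1, n):
        parent = (child - 1) // 2
        adj[parent].append(child)
        adj[child].append(parent)
    return adj


adj = complete_binary_tree(depth=10)
full = receptive_field_sizes(adj, root=0, num_layers=4)

# Assumed layer-wise sampling budget: keep at most k new nodes per layer.
k = 2
sampled_bound = [1 + k * layer for layer in range(5)]

print(full)           # [1, 3, 7, 15, 31]
print(sampled_bound)  # [1, 3, 5, 7, 9]
```

The unsampled count roughly doubles with every layer, while the sampled bound grows linearly in the number of layers. That linear bound is what makes a fixed sampling budget attractive for deep GNNs.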
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive sampling method that learns to identify the nodes crucial for training a GNN
Uses a second GNN to predict node sampling probabilities
Trains the sampler by optimizing the downstream task objective directly
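The points above can be sketched as a toy training loop in which the sampling probabilities are learned from the loss of whatever was sampled. This is a minimal illustration under loud assumptions, not the authors' implementation. A table of per-node logits stands in for the second GNN, a synthetic loss stands in for the downstream GNN's loss, and a score-function (REINFORCE) estimator with a moving-average baseline is assumed as the way to pass a learning signal through the discrete sampling step, since the summary does not specify one. All names (`train_sampler`, `useful`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_sampler(useful, steps=2000, lr=0.1, seed=0):
    """Learn per-node inclusion probabilities from a toy downstream loss.

    `useful[i]` is 1 if node i helps the (hypothetical) downstream task.
    The sampler never sees `useful` directly, only the scalar loss.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(useful)  # stand-in for the sampler GNN's outputs
    baseline = 0.0                # moving average of the loss (variance reduction)
    for _ in range(steps):
        probs = [sigmoid(t) for t in logits]
        # Sample each node independently with its current probability.
        sample = [1 if rng.random() < p else 0 for p in probs]
        # Toy loss: useless nodes that were sampled plus useful nodes that were missed.
        loss = sum(abs(s - u) for s, u in zip(sample, useful))
        advantage = loss - baseline
        for i, (s, p) in enumerate(zip(sample, probs)):
            # d/d logit of log Bernoulli(s; p) is (s - p); descend on the loss.
            logits[i] -= lr * advantage * (s - p)
        baseline = 0.9 * baseline + 0.1 * loss
    return [sigmoid(t) for t in logits]


useful = [1, 1, 1, 0, 0, 0]
probs = train_sampler(useful)
print([round(p, 2) for p in probs])
```

In this toy setting the learned probabilities drift upward for the nodes that lower the loss and downward for the rest. That is the behavior an adaptive sampler is meant to have, as opposed to a static heuristic that fixes the probabilities in advance.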