🤖 AI Summary
Explanation methods for Graph Neural Networks (GNNs) incur prohibitively high computational cost for edge and feature attribution, which limits their scalability to large graphs.
Method: The paper proposes DistShap, a distributed Shapley-value attribution framework that is the first to scale to million-dimensional feature spaces. The approach samples subgraphs collaboratively across multiple GPUs, runs GNN inference in parallel, and solves a distributed least-squares problem to compute edge importance scores efficiently.
Contribution/Results: DistShap extends Shapley value computation from single-machine settings to large distributed environments, achieving near-linear speedup on up to 128 GPUs of the NERSC Perlmutter supercomputer. Experiments show that it outperforms most existing GNN explainers in explanation fidelity while reducing attribution latency by one to two orders of magnitude on graphs with millions of nodes and edges, establishing a scalable path to high-fidelity GNN explanation.
📝 Abstract
With the growing adoption of graph neural networks (GNNs), explaining their predictions has become increasingly important. However, attributing predictions to specific edges or features remains computationally expensive. For example, classifying a node with 100 neighbors using a 3-layer GNN may involve identifying important edges from millions of candidates contributing to the prediction. To address this challenge, we propose DistShap, a parallel algorithm that distributes Shapley value-based explanations across multiple GPUs. DistShap operates by sampling subgraphs in a distributed setting, executing GNN inference in parallel across GPUs, and solving a distributed least squares problem to compute edge importance scores. DistShap outperforms most existing GNN explanation methods in accuracy and is the first to scale to GNN models with millions of features by using up to 128 GPUs on the NERSC Perlmutter supercomputer.
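The pipeline the abstract describes (sample edge subsets, run the model on each masked subgraph, then solve a weighted least-squares problem for edge importances) follows the standard KernelSHAP recipe. Below is a minimal single-process sketch of that recipe for edge attribution; it is an illustration, not the DistShap implementation. The function name `kernel_shap_edges` and the `predict` callback (mapping a boolean edge mask to a scalar model output) are hypothetical stand-ins for a real GNN inference routine, and in DistShap the sampling, inference, and least-squares steps would each be sharded across GPUs.

```python
import numpy as np
from math import comb

def kernel_shap_edges(predict, num_edges, num_samples=2048, rng=None):
    """Approximate Shapley values for edges via the KernelSHAP
    weighted least-squares formulation. `predict` maps a boolean
    edge mask of shape (num_edges,) to a scalar model output.
    Hypothetical sketch, not the DistShap implementation."""
    rng = np.random.default_rng(rng)
    M = num_edges
    # Sample random coalitions (edge subsets). The all-on/all-off masks
    # carry infinite kernel weight; they are excluded here and handled
    # exactly through the efficiency constraint below.
    Z = rng.integers(0, 2, size=(num_samples, M)).astype(bool)
    sizes = Z.sum(axis=1)
    keep = (sizes > 0) & (sizes < M)
    Z, sizes = Z[keep], sizes[keep]
    # Shapley kernel weight for a coalition of size s.
    w = np.array([(M - 1) / (comb(M, s) * s * (M - s)) for s in sizes])
    # One model evaluation (GNN inference) per sampled subgraph --
    # this is the step DistShap parallelizes across GPUs.
    y = np.array([predict(z) for z in Z])
    base = predict(np.zeros(M, dtype=bool))
    full = predict(np.ones(M, dtype=bool))
    # Enforce efficiency: attributions must sum to (full - base).
    # Substitute phi_M = (full - base) - sum(phi_1..phi_{M-1}) and
    # solve the resulting weighted least-squares problem.
    A = Z[:, :-1].astype(float) - Z[:, [-1]].astype(float)
    b = y - base - Z[:, -1] * (full - base)
    sw = np.sqrt(w)
    phi_rest, *_ = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)
    phi_last = (full - base) - phi_rest.sum()
    return np.append(phi_rest, phi_last)
```

For a model that is linear in the edge mask, this recovers the per-edge coefficients exactly, which is a useful sanity check; the distributed version in the paper partitions the sampled coalitions and the least-squares solve across workers rather than changing the estimator itself.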