GRAFT: Auditing Graph Neural Networks via Global Feature Attribution

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of global, feature-level interpretability in graph neural networks (GNNs), which makes it hard to identify the input features that drive predictions. The authors propose GRAFT, a post hoc framework that provides global attribution at the node attribute level, something existing global GNN explainers do not address. GRAFT combines diversity-guided exemplar selection, Integrated Gradients attribution, and class-level aggregation to build a feature importance profile for each class; these profiles can then be distilled into concise natural language rules by a large language model with self-refinement. Experiments across multiple datasets and GNN architectures indicate that GRAFT captures the features the models rely on, supports bias analysis, and enables feature-efficient transfer learning. In structured human evaluations, the generated rules are rated as accurate and useful.

📝 Abstract

Graph Neural Networks (GNNs) achieve strong performance on node classification tasks but remain difficult to interpret, particularly with respect to which input features drive their predictions. Existing global GNN explainers operate at the structural level, identifying recurring subgraph motifs, but none explain model behaviour globally at the level of input node attributes. We propose GRAFT, a post hoc global explanation framework that identifies class-level feature importance profiles for GNNs. The method combines diversity-guided exemplar selection, Integrated Gradients-based attribution, and aggregation to construct a global view of feature influence for each class, which can be further expressed as concise natural language rules using a large language model with self-refinement. We evaluate GRAFT across multiple datasets, architectures, and experimental settings, demonstrating its effectiveness in capturing model-relevant features, supporting bias analysis, and enabling feature-efficient transfer learning. In addition, we introduce a structured human evaluation protocol to assess the interpretability of generated rules along dimensions such as accuracy and usefulness. Our results suggest that GRAFT provides a practical and interpretable approach for analysing feature-level behaviour in GNNs, bridging quantitative attribution with human-understandable explanations.
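
The three quantitative stages named in the abstract (exemplar selection, Integrated Gradients attribution, class-level aggregation) can be illustrated with a small NumPy sketch. This is not the authors' implementation, and every concrete choice here is an assumption made for illustration: the toy one-layer mean-aggregation model, the all-zeros baseline, farthest-point sampling as the diversity heuristic, mean absolute attribution as the aggregation rule, and all function names.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def class_prob(A_hat, X, W, node, cls):
    # Toy one-layer GNN (assumption): logits = A_hat @ X @ W.
    return softmax(A_hat[node] @ X @ W)[cls]

def grad_wrt_features(A_hat, X, W, node, cls):
    # Analytic gradient of class_prob with respect to the feature matrix X.
    p = softmax(A_hat[node] @ X @ W)
    g = -p[cls] * p
    g[cls] += p[cls]                      # d p_cls / d logits
    return np.outer(A_hat[node], W @ g)   # same shape as X

def integrated_gradients(A_hat, X, W, node, cls, steps=200):
    # Midpoint Riemann approximation of Integrated Gradients,
    # using an all-zeros feature baseline (assumption).
    baseline = np.zeros_like(X)
    avg_grad = np.zeros_like(X)
    for a in (np.arange(steps) + 0.5) / steps:
        point = baseline + a * (X - baseline)
        avg_grad += grad_wrt_features(A_hat, point, W, node, cls)
    return (X - baseline) * avg_grad / steps

def select_diverse(X, candidates, k):
    # Farthest-point sampling in feature space: a stand-in
    # for "diversity-guided exemplar selection" (assumption).
    chosen = [candidates[0]]
    dist = np.linalg.norm(X[candidates] - X[chosen[0]], axis=1)
    while len(chosen) < min(k, len(candidates)):
        nxt = int(np.argmax(dist))
        chosen.append(candidates[nxt])
        new_dist = np.linalg.norm(X[candidates] - X[candidates[nxt]], axis=1)
        dist = np.minimum(dist, new_dist)
    return chosen

def class_profile(A_hat, X, W, preds, cls, k=3):
    # Mean absolute per-feature attribution over the selected exemplars.
    candidates = np.flatnonzero(preds == cls)
    if candidates.size == 0:
        return np.zeros(X.shape[1])
    exemplars = select_diverse(X, candidates, k)
    attrs = [np.abs(integrated_gradients(A_hat, X, W, n, cls)).sum(axis=0)
             for n in exemplars]
    return np.mean(attrs, axis=0)

# Random toy graph: 8 nodes, 5 features, 3 classes.
rng = np.random.default_rng(0)
n, d, c = 8, 5, 3
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)
A_hat = A / A.sum(axis=1, keepdims=True)   # row-normalised adjacency
X = rng.random((n, d))
W = rng.normal(size=(d, c))
preds = (A_hat @ X @ W).argmax(axis=1)

profiles = {cls: class_profile(A_hat, X, W, preds, cls) for cls in range(c)}
for cls, profile in profiles.items():
    print(cls, np.round(profile, 3))
```

Each entry of `profiles` is a length-`d` vector ranking features for one class. In a pipeline like the one the abstract describes, a ranking of this kind is what would be handed to a language model for rule generation; that step is omitted here because the source gives no detail on prompts or models.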

Problem

Research questions and friction points this paper is trying to address.

Graph Neural Networks

Interpretability

Feature Attribution

Global Explanation

Node Classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Global Explanation

Feature Attribution

Graph Neural Networks