🤖 AI Summary
This work addresses the lack of interpretability in graph neural networks (GNNs), for which accurate instance-level explanations are typically costly to obtain. The authors propose B-cos GNNs, a framework in which each prediction decomposes exactly into per-node, per-feature contributions. The model uses linear (sum-based) aggregation and replaces the nonlinear message and update functions of a Graph Isomorphism Network (GIN) with B-cos transforms, so the whole network acts as a single input-dependent linear map. Explanations are read off from one forward and one backward pass, with no post-hoc explainer, modified learning objective, or input perturbation. At the cost of a small loss in predictive accuracy, the approach achieves state-of-the-art explainability on synthetic and real-world benchmarks and produces explanations orders of magnitude faster than post-hoc baselines.
📝 Abstract
We introduce B-cos GNNs, an inherently explainable class of graph neural networks whose predictions decompose exactly into per-node, per-feature contributions via a single input-dependent linear map. B-cos GNNs use linear (sum-based) aggregation and replace non-linear message and update functions with B-cos transforms. This induces meaningful, task-specific weight-input alignment that is directly accessible through the model's dynamic linearity. Instance-level explanations follow from a single forward and backward pass, requiring no auxiliary explainer, modified learning objective, or perturbation procedure. Instantiated as a GIN, our approach trades small losses in predictive accuracy for state-of-the-art explainability across diverse synthetic and real-world benchmarks, producing explanations orders of magnitude faster than post-hoc baselines.
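The exact decomposition described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes the B-cos transform in the form used by the original B-cos networks, |cos(h, w)|^(B-1) · ŵᵀh with unit-norm ŵ, and it reduces the model to a single sum-aggregation layer, one output unit, and a sum readout. The names `bcos_scale` and `explain` are hypothetical. Because the toy has only one layer, the input-dependent linear map is written out in closed form; a deeper model would presumably recover it with the backward pass the abstract mentions.

```python
import math

def bcos_scale(h, w_hat, b=2.0):
    """Input-dependent factor |cos(h, w_hat)|^(b-1) of one B-cos unit (w_hat has unit norm)."""
    norm_h = math.sqrt(sum(v * v for v in h))
    if norm_h == 0.0:
        return 0.0
    cos = sum(hi * wi for hi, wi in zip(h, w_hat)) / norm_h
    return abs(cos) ** (b - 1.0)

def explain(x, neighbors, w, b=2.0):
    """Toy layer: sum aggregation, one B-cos unit per node, sum readout.

    Returns the scalar prediction and a per-node, per-feature contribution table.
    """
    w_norm = math.sqrt(sum(v * v for v in w))
    w_hat = [v / w_norm for v in w]
    n, d = len(x), len(x[0])

    # Sum aggregation over each node and its neighbours (GIN-style, epsilon = 0).
    h = [[x[v][j] + sum(x[u][j] for u in neighbors[v]) for j in range(d)]
         for v in range(n)]

    # Forward pass: each node's B-cos output is scale[v] * (w_hat . h[v]).
    scale = [bcos_scale(h[v], w_hat, b) for v in range(n)]
    y = sum(scale[v] * sum(h[v][j] * w_hat[j] for j in range(d)) for v in range(n))

    # With the scales held fixed, y is linear in x. Node u feeds into itself and
    # into every node v that lists u as a neighbour.
    reach = [scale[u] + sum(scale[v] for v in range(n) if u in neighbors[v])
             for u in range(n)]
    contrib = [[x[u][j] * w_hat[j] * reach[u] for j in range(d)] for u in range(n)]
    return y, contrib

# Hypothetical 3-node path graph with 2 features per node.
x = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
w = [3.0, 4.0]

y, contrib = explain(x, neighbors, w)
total = sum(sum(row) for row in contrib)
print(y, total)  # the two values agree up to floating-point rounding
```

The contribution table sums back to the prediction because, once the cosine factors are fixed for a given input, every remaining operation (aggregation, weighting, readout) is linear in the node features.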