🤖 AI Summary
Existing faithfulness metrics for Graph Neural Network (GNN) explanations are fundamentally inconsistent: high faithfulness under one metric does not guarantee it under another, hindering reliable evaluation.
Method: Through rigorous theoretical analysis and extensive empirical validation, we systematically investigate the conditions and limits of faithfulness in GNN explanation.
Contribution/Results: We establish three key findings: (1) Enforcing perfect faithfulness on standard GNNs inevitably collapses explanations into uninformative constant outputs; (2) Faithfulness is intrinsically tied to architecture: self-explaining and domain-invariant architectures inherently possess higher faithfulness potential; (3) Faithfulness is tightly coupled with out-of-distribution (OOD) generalization: identifying domain-invariant subgraphs alone is insufficient for robust generalization; faithful explanations are additionally required. Our work formally delineates the boundaries, prerequisites, and design implications of faithfulness, establishing a theoretical foundation and practical guidelines for developing faithful, interpretable GNNs.
📝 Abstract
As Graph Neural Networks (GNNs) become more pervasive, it becomes paramount to build reliable tools for explaining their predictions. A core desideratum is that explanations are *faithful*, i.e., that they portray an accurate picture of the GNN's reasoning process. However, a number of different faithfulness metrics exist, raising the question of what faithfulness is exactly and how to achieve it. We make three key contributions. We begin by showing that *existing metrics are not interchangeable*, i.e., explanations attaining high faithfulness according to one metric may be unfaithful according to others, and can systematically ignore important properties of explanations. We proceed to show that, surprisingly, *optimizing for faithfulness is not always a sensible design goal*. Specifically, we prove that for injective regular GNN architectures, perfectly faithful explanations are completely uninformative. This does not apply to modular GNNs, such as self-explainable and domain-invariant architectures, prompting us to study the relationship between architectural choices and faithfulness. Finally, we show that *faithfulness is tightly linked to out-of-distribution generalization*, in that simply ensuring that a GNN can correctly recognize the domain-invariant subgraph, as prescribed by the literature, does not guarantee that it is invariant unless this subgraph is also faithful. The code is publicly available on GitHub.
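To make the metric-inconsistency claim concrete, here is a minimal sketch (not the authors' code; the model, node ids, and scores are all illustrative assumptions) of two widely used faithfulness metrics, fidelity+ (necessity: remove the explanation and see if the prediction drops) and fidelity- (sufficiency: keep only the explanation and see if the prediction survives), disagreeing about the same explanation:

```python
# Toy illustration: two standard faithfulness metrics can rank the
# same explanation differently. The "model" below is a hypothetical
# stand-in for a GNN's class probability on a 4-node graph, where
# nodes {0, 1} jointly drive the prediction and node 2 is a weak backup.

def model(mask):
    """Return a made-up class probability for the subgraph `mask`
    (a frozenset of kept node ids)."""
    if {0, 1} <= mask:
        return 0.9
    if 2 in mask:
        return 0.6
    return 0.1

def fidelity_plus(full, expl):
    """Necessity: how much the score drops when the explanation is removed."""
    return model(full) - model(full - expl)

def fidelity_minus(full, expl):
    """Sufficiency: how much the score changes when only the explanation is kept."""
    return model(full) - model(expl)

full = frozenset({0, 1, 2, 3})
expl = frozenset({0, 1})

print(fidelity_plus(full, expl))   # 0.9 - 0.6 = 0.3 (node 2 partly compensates)
print(fidelity_minus(full, expl))  # 0.9 - 0.9 = 0.0 (explanation alone suffices)
```

The explanation {0, 1} is perfectly faithful under fidelity- (it reproduces the prediction on its own) yet only moderately faithful under fidelity+ (a redundant node masks its removal), which is the kind of disagreement the paper means when it says the metrics are not interchangeable.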