i-WiViG: Interpretable Window Vision GNN

📅 2025-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Graph neural networks (GNNs) for remote sensing image analysis suffer from poor interpretability, high distortion in post-hoc explanations, and insufficient explanation sparsity. Method: We propose a self-explaining GNN framework tailored for vision tasks. Its core innovations are: (1) window-constrained local receptive field graph encoding, balancing long-range dependency modeling with local structural preservation; and (2) a self-explaining graph bottleneck that intrinsically generates faithful, sparse subgraph-level explanations during forward propagation, avoiding the distortion introduced by post-hoc explainers. The method requires no auxiliary explainer and jointly outputs predictions and their corresponding critical subgraphs. Results: On remote sensing classification and regression benchmarks, the model achieves competitive performance. It reduces explanation infidelity compared to existing Vision GNNs while maintaining >85% subgraph sparsity, thereby enhancing model trustworthiness and practical utility.

📝 Abstract
Deep learning models based on graph neural networks have emerged as a popular approach for solving computer vision problems. They encode the image into a graph structure and can efficiently capture the long-range dependencies typically present in remote sensing imagery. However, an important drawback of these methods is their black-box nature, which may hamper their wider usage in critical applications. In this work, we tackle the self-interpretability of graph-based vision models by proposing our Interpretable Window Vision GNN (i-WiViG) approach, which provides explanations by automatically identifying the relevant subgraphs for the model prediction. This is achieved with window-based image graph processing that constrains the node receptive field to a local image region, and with a self-interpretable graph bottleneck that ranks the importance of the long-range relations between image regions. We evaluate our approach on remote sensing classification and regression tasks, showing that it achieves competitive performance while providing inherent and faithful explanations through the identified relations. Further, quantitative evaluation reveals that our model reduces the infidelity of post-hoc explanations compared to other Vision GNN models, without sacrificing explanation sparsity.
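The window-based graph processing described above can be illustrated with a small sketch. This is an assumption-based reconstruction, not the authors' code: patch embeddings are partitioned into non-overlapping square windows, and a k-nearest-neighbor graph is built only among patches of the same window, so each node's receptive field stays confined to a local image region.

```python
# Hypothetical sketch of window-constrained graph construction
# (illustrative only; the paper's actual graph encoder may differ).
import numpy as np

def window_knn_edges(features, grid, window, k):
    """features: (H*W, D) patch embeddings laid out row-major on an
    H x W grid; window: side length of square windows; k: neighbors
    per node. Returns (src, dst) edges confined to each window."""
    H, W = grid
    idx = np.arange(H * W).reshape(H, W)
    edges = []
    for i in range(0, H, window):
        for j in range(0, W, window):
            nodes = idx[i:i + window, j:j + window].ravel()
            feats = features[nodes]                      # (w*w, D)
            # pairwise squared distances within this window only
            d = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
            np.fill_diagonal(d, np.inf)                  # exclude self-loops
            nn = np.argsort(d, axis=1)[:, :k]            # k nearest per node
            for a, row in zip(nodes, nn):
                edges.extend((a, nodes[b]) for b in row)
    return edges

# 4x4 patch grid, 2x2 windows, 1 nearest neighbor per node
feats = np.random.default_rng(0).normal(size=(16, 8))
E = window_knn_edges(feats, (4, 4), 2, 1)
```

Because edges are drawn only inside a window, every node aggregates information from a bounded image region, which is what makes the later subgraph explanations spatially localized.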
Problem

Research questions and friction points this paper is trying to address.

Enhance interpretability of graph-based vision models.
Identify relevant subgraphs for model predictions.
Reduce infidelity of explanations in Vision GNNs.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Window-based image graph processing
Self-interpretable graph bottleneck
Identifies relevant subgraphs for predictions
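A graph-bottleneck style selection can be sketched as follows. This is a minimal illustration under assumed details (a linear edge scorer and a top-k cutoff stand in for the paper's trained bottleneck): each edge receives an importance score, only the top-scoring edges carry messages in the forward pass, and those same kept edges double as the explanation subgraph.

```python
# Minimal sketch of a graph-bottleneck forward pass (hypothetical,
# not the paper's exact mechanism): score edges, keep the top-k,
# and aggregate messages only over the kept explanation subgraph.
import numpy as np

def bottleneck_forward(x, edges, w_score, keep_ratio=0.15):
    """x: (N, D) node features; edges: (E, 2) int array of (src, dst);
    w_score: (2*D,) weights of a linear edge scorer (a stand-in for a
    trained scoring module). Returns updated features and kept edges."""
    src, dst = edges[:, 0], edges[:, 1]
    pair = np.concatenate([x[src], x[dst]], axis=1)      # (E, 2*D)
    scores = pair @ w_score                              # one scalar per edge
    k = max(1, int(keep_ratio * len(edges)))
    kept = edges[np.argsort(scores)[-k:]]                # top-k edges survive
    # uniformly scaled sum of messages over the kept edges
    # (a stand-in for a learned aggregation)
    out = x.copy()
    for s, d in kept:
        out[d] += x[s] / k
    return out, kept

rng = np.random.default_rng(1)
x = rng.normal(size=(6, 4))
edges = np.array([(i, j) for i in range(6) for j in range(6) if i != j])
w = rng.normal(size=(8,))
out, kept = bottleneck_forward(x, edges, w, keep_ratio=0.2)
```

Because the sparse edge selection happens inside the forward pass, the explanation is produced jointly with the prediction rather than reconstructed afterwards, which is the property the paper credits for reducing explanation infidelity.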