Explainable Scene Understanding with Qualitative Representations and Graph Neural Networks

📅 2025-04-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current scene understanding methods for autonomous driving offer limited interpretability of agent behavior and model spatio-temporal relationships incompletely. Method: The paper proposes a graph-based reasoning framework that combines Qualitative eXplainable Graphs (QXGs) with Graph Neural Networks (GNNs). Earlier QXG work used shallow machine learning models restricted to single relation chains between object pairs; here, the GNN processes the entire scene graph and so takes into account the qualitative spatio-temporal relations among all traffic participants. The model is also designed to cope with the class imbalance inherent in relevant object identification. Contribution/Results: Evaluated on the nuScenes dataset enriched with DriveLM's human-annotated relevance labels, the method outperforms baseline methods, indicating that qualitative representations and deep learning can be combined for explainable scene understanding in autonomous driving.

📝 Abstract

This paper investigates the integration of graph neural networks (GNNs) with Qualitative Explainable Graphs (QXGs) for scene understanding in automated driving. Scene understanding is the basis for any further reactive or proactive decision-making. Scene understanding and related reasoning is inherently an explanation task: why is another traffic participant doing something, what or who caused their actions? While previous work demonstrated QXGs' effectiveness using shallow machine learning models, these approaches were limited to analysing single relation chains between object pairs, disregarding the broader scene context. We propose a novel GNN architecture that processes entire graph structures to identify relevant objects in traffic scenes. We evaluate our method on the nuScenes dataset enriched with DriveLM's human-annotated relevance labels. Experimental results show that our GNN-based approach achieves superior performance compared to baseline methods. The model effectively handles the inherent class imbalance in relevant object identification tasks while considering the complete spatial-temporal relationships between all objects in the scene. Our work demonstrates the potential of combining qualitative representations with deep learning approaches for explainable scene understanding in autonomous driving systems.
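To make the idea of processing the entire graph rather than a single relation chain more concrete, the following is a minimal sketch of one round of relation-typed message passing over a small scene graph. It is not the paper's architecture: the relation labels, the scalar per-relation weights, the node features, and the function name `message_pass` are all assumptions chosen for illustration, and a real GNN would learn weight matrices instead of using fixed scalars.

```python
from collections import defaultdict


def message_pass(node_feats, edges, rel_weight, self_weight=1.0):
    """One round of relation-typed mean aggregation (illustrative only).

    node_feats: dict mapping node id -> list of floats
    edges:      list of (source, relation, target) triples
    rel_weight: dict mapping relation label -> scalar applied to messages
                of that type (a stand-in for a learned weight matrix)
    """
    dim = len(next(iter(node_feats.values())))

    # Collect the weighted messages arriving at each target node.
    incoming = defaultdict(list)
    for src, rel, dst in edges:
        incoming[dst].append([rel_weight[rel] * x for x in node_feats[src]])

    updated = {}
    for node, feats in node_feats.items():
        msgs = incoming.get(node, [])
        if msgs:
            # Mean over all incoming messages, per feature dimension.
            agg = [sum(m[i] for m in msgs) / len(msgs) for i in range(dim)]
        else:
            agg = [0.0] * dim
        # Self term plus aggregated neighbour messages, followed by ReLU.
        updated[node] = [max(0.0, self_weight * f + a) for f, a in zip(feats, agg)]
    return updated


# Hypothetical three-object scene with made-up qualitative relations.
node_feats = {"ego": [1.0, 0.0], "car": [0.0, 1.0], "ped": [1.0, 1.0]}
edges = [
    ("car", "approaching", "ego"),
    ("ped", "stable", "ego"),
    ("ego", "receding", "car"),
]
rel_weight = {"approaching": 2.0, "receding": -1.0, "stable": 0.5}

updated = message_pass(node_feats, edges, rel_weight)
```

Because every node aggregates over all of its incoming edges, the updated representation of `ego` depends on both `car` and `ped` at once, which is the kind of scene-wide context that a pairwise relation chain would not capture.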

Problem

Research questions and friction points this paper is trying to address.

Integrating GNNs with QXGs for explainable autonomous driving scene understanding

Overcoming limitations of single relation chains by analyzing entire scene graphs

Improving relevant object identification using spatial-temporal relationship modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

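A GNN processes the entire scene graph rather than single relation chains between object pairs

Combines QXGs with deep learning for explainable scene understanding

Handles class imbalance in relevant object identification while considering the spatial-temporal relationships between all objects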
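The summary does not say which mechanism is used to handle class imbalance. As one common option, and purely as an assumption, the sketch below weights a binary cross-entropy loss by inverse class frequency so that the rare "relevant" class contributes as much as the frequent "not relevant" class. The function names `class_weights` and `weighted_bce` are hypothetical.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights (w_neg, w_pos) for binary labels in {0, 1}."""
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return n / (2 * neg), n / (2 * pos)


def weighted_bce(probs, labels, w_neg, w_pos, eps=1e-12):
    """Mean binary cross-entropy with a separate weight per class."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1 - p))
    return total / len(labels)


# Hypothetical scene: one relevant object among four.
labels = [1, 0, 0, 0]
w_neg, w_pos = class_weights(labels)
loss = weighted_bce([0.5, 0.5, 0.5, 0.5], labels, w_neg, w_pos)
```

With these weights, the single relevant object and the three non-relevant objects each account for half of the total loss, so a model cannot reach a low loss by predicting "not relevant" everywhere.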
