🤖 AI Summary
This work addresses risk perception discrepancies in advanced driver assistance systems that stem not from component failures but from semantic ambiguity and partial observability. The authors propose a scenario-centric, two-stage large language model framework that evaluates risk reasoning within deterministic, temporally bounded multimodal driving scenario windows, using a unified prompting strategy and a closed numeric risk schema, which enables reproducible, scenario-level auditing of model behavior. Experiments reveal systematic divergences across models in assessing risk severity, escalating high-risk scenarios, using evidential cues, and performing causal attribution, particularly where vulnerable road users are involved. These findings indicate that such discrepancies arise primarily from intrinsic semantic indeterminacy rather than isolated model failure, underscoring the need for explicit ambiguity management in safety-critical driver assistance applications.
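To make the unified prompting strategy and closed numeric risk schema concrete, here is a minimal sketch in Python. The 0-4 severity levels, the JSON reply format, the `PROMPT_TEMPLATE` wording, and the `parse_risk_response` helper are all illustrative assumptions, not the authors' actual schema or code.

```python
import json
import re

# Hypothetical closed numeric risk schema: every model must answer with an
# integer severity drawn from this set, so outputs stay comparable.
RISK_LEVELS = {0: "negligible", 1: "low", 2: "moderate", 3: "high", 4: "critical"}

# Hypothetical fixed prompt: identical for every model and every scenario,
# varying only in the scenario window description.
PROMPT_TEMPLATE = (
    "You are auditing a driving scenario for risk.\n"
    "Scenario window: {window}\n"
    'Respond ONLY with JSON: {{"risk": <integer 0-4>, "evidence": "..."}}'
)

def parse_risk_response(raw: str) -> dict:
    """Validate a model reply against the closed schema; reject free-form answers."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    reply = json.loads(match.group(0))
    if reply.get("risk") not in RISK_LEVELS:
        raise ValueError(f"risk level {reply.get('risk')!r} outside closed set")
    return reply

# Example: the same prompt goes to every model under audit.
prompt = PROMPT_TEMPLATE.format(
    window="pedestrian crossing 8 m ahead, ego vehicle at 30 km/h"
)
```

Rejecting any reply that falls outside the closed set is what keeps outputs structured and comparable across models, so that divergence in assigned severity reflects the models' reasoning rather than formatting differences.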
📝 Abstract
Advanced Driver Assistance Systems (ADAS) increasingly rely on learning-based perception, yet safety-relevant failures often arise without component malfunction, driven instead by partial observability and semantic ambiguity in how risk is interpreted and communicated. This paper presents a scenario-centric framework for reproducible auditing of LLM-based risk reasoning in urban driving contexts. Deterministic, temporally bounded scenario windows are constructed from multimodal driving data and evaluated under fixed prompt constraints and a closed numeric risk schema, ensuring structured and comparable outputs across models. Experiments on a curated near-people scenario set compare two text-only models and one multimodal model under identical inputs and prompts. Results reveal systematic inter-model divergence in severity assignment, high-risk escalation, evidence use, and causal attribution. Disagreement extends to the interpretation of vulnerable road user presence, indicating that variability often reflects intrinsic semantic indeterminacy rather than isolated model failure. These findings highlight the importance of scenario-centric auditing and explicit ambiguity management when integrating LLM-based reasoning into safety-aligned driver assistance systems.
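As a companion to the abstract, the following is a minimal sketch of what "deterministic, temporally bounded scenario windows" could look like, assuming the driving log is a time-sorted list of (timestamp, observation) pairs. `ScenarioWindow`, `build_windows`, and the 4-second default are hypothetical names and choices for illustration, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScenarioWindow:
    """A deterministic, temporally bounded slice of a driving log."""
    scenario_id: str
    t_start: float   # window start, seconds into the log
    t_end: float     # window end; a fixed duration keeps windows comparable
    frames: tuple    # multimodal observations (e.g., camera captions, tags)

def build_windows(log, window_s: float = 4.0):
    """Cut a time-ordered log into fixed-length, non-overlapping windows.

    `log` is assumed to be a list of (timestamp, observation) pairs sorted
    by time. Identical inputs always yield identical windows, which is the
    property a reproducible audit relies on.
    """
    if not log:
        return []
    windows, bucket = [], []
    start = log[0][0]
    for ts, obs in log:
        if ts - start >= window_s:
            windows.append(ScenarioWindow(
                f"w{len(windows)}", start, start + window_s, tuple(bucket)))
            start, bucket = ts, []
        bucket.append(obs)
    if bucket:
        windows.append(ScenarioWindow(
            f"w{len(windows)}", start, start + window_s, tuple(bucket)))
    return windows
```

Because the slicing depends only on the log contents and the window length, freezing the windows before prompting guarantees that every model under comparison sees exactly the same inputs.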