Why Braking? Scenario Extraction and Reasoning Utilizing LLM

📅 2025-07-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Safety-critical braking corner cases are hard to find in massive autonomous driving datasets because they are rare and difficult to interpret. Method: The paper proposes an LLM-based framework for explaining why a vehicle brakes. It converts low-level numerical signals into natural-language scenario descriptions that the LLM can interpret and classify, and it retrieves scenarios along two paths: category-based search for known scenarios and embedding-based retrieval for unknown out-of-distribution (OOD) scenarios. Results: On the Argoverse 2 Sensor Dataset, using scenario annotations curated by the authors, the method outperforms rule-based baselines and generalizes well to OOD scenarios.

📝 Abstract

The growing number of ADAS-equipped vehicles has led to a dramatic increase in driving data, yet most of it captures routine driving behavior. Identifying and understanding safety-critical corner cases within this vast body of data remains a significant challenge. Braking events are particularly indicative of potentially hazardous situations, motivating the central question of our research: Why does a vehicle brake? Existing approaches primarily rely on rule-based heuristics that retrieve target scenarios using predefined condition filters. While effective in simple environments such as highways, these methods generalize poorly to complex urban settings. In this paper, we propose a novel framework that leverages a Large Language Model (LLM) for scenario understanding and reasoning. Our method bridges the gap between low-level numerical signals and natural language descriptions, enabling the LLM to interpret and classify driving scenarios. We propose a dual-path scenario retrieval scheme that supports both category-based search for known scenarios and embedding-based retrieval for unknown Out-of-Distribution (OOD) scenarios. To facilitate evaluation, we curate scenario annotations on the Argoverse 2 Sensor Dataset. Experimental results show that our method outperforms rule-based baselines and generalizes well to OOD scenarios.
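To make the step from low-level numerical signals to natural language more concrete, here is a minimal sketch of one way it could look. It is not the authors' pipeline: the 2.0 m/s^2 threshold, the `BrakingEvent` type, and the functions `find_braking_events` and `describe_event` are all hypothetical names and values chosen for illustration. The sketch finds braking intervals in a speed trace and renders each one as a sentence that could be placed in an LLM prompt.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BrakingEvent:
    start_s: float     # time at which the deceleration starts
    end_s: float       # time at which the deceleration ends
    peak_decel: float  # largest deceleration in the interval, m/s^2 (positive)


def find_braking_events(times: List[float], speeds: List[float],
                        decel_threshold: float = 2.0) -> List[BrakingEvent]:
    """Group consecutive samples whose deceleration meets the threshold.

    The threshold is an assumed value, not one taken from the paper.
    """
    events: List[BrakingEvent] = []
    current: Optional[BrakingEvent] = None
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        decel = (speeds[i - 1] - speeds[i]) / dt
        if decel >= decel_threshold:
            if current is None:
                current = BrakingEvent(times[i - 1], times[i], decel)
            else:
                current.end_s = times[i]
                current.peak_decel = max(current.peak_decel, decel)
        elif current is not None:
            events.append(current)
            current = None
    if current is not None:
        events.append(current)
    return events


def describe_event(event: BrakingEvent,
                   lead_gap_m: Optional[float] = None) -> str:
    """Render a braking event as plain text for use in an LLM prompt."""
    text = (f"Ego vehicle brakes from t={event.start_s:.1f}s to "
            f"t={event.end_s:.1f}s with peak deceleration "
            f"{event.peak_decel:.1f} m/s^2.")
    if lead_gap_m is not None:
        text += f" A lead object is {lead_gap_m:.0f} m ahead."
    return text


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    speeds = [10.0, 10.0, 7.0, 4.0, 4.0]  # m/s
    for ev in find_braking_events(times, speeds):
        print(describe_event(ev, lead_gap_m=12.0))
```

A real system would also need to describe surrounding agents and map context. The sketch covers only the ego speed signal.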

Problem

Research questions and friction points this paper is trying to address.

Identifying safety-critical corner cases in vast driving data

Understanding braking events in complex urban environments

Generalizing scenario retrieval for known and unknown cases

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages LLM for scenario understanding and reasoning

Bridges numerical signals and natural language descriptions

Dual-path retrieval for known and unknown scenarios
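The dual-path retrieval idea can be illustrated with a small sketch. It rests on several assumptions: the category labels, the sample scenarios, and the function names are hypothetical, and `embed` is a toy bag-of-words stand-in for a real text-embedding model, used only to keep the example self-contained. One path filters by a known category label. The other ranks scenario descriptions by cosine similarity to a free-text query, which is how a scenario with no predefined category could still be found.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical scenario records; labels and descriptions are illustrative only.
SCENARIOS: List[Dict[str, str]] = [
    {"id": "s1", "category": "lead_vehicle_braking",
     "description": "lead vehicle slows down ahead of ego"},
    {"id": "s2", "category": "pedestrian_crossing",
     "description": "pedestrian crosses the road in front of ego"},
    {"id": "s3", "category": "unknown",
     "description": "dog runs across the road near parked cars"},
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_by_category(scenarios: List[Dict[str, str]],
                         category: str) -> List[str]:
    """Path 1: exact match on a known category label."""
    return [s["id"] for s in scenarios if s["category"] == category]


def retrieve_by_embedding(scenarios: List[Dict[str, str]],
                          query: str, top_k: int = 1) -> List[str]:
    """Path 2: rank descriptions by similarity to a free-text query."""
    q = embed(query)
    ranked = sorted(scenarios,
                    key=lambda s: cosine(q, embed(s["description"])),
                    reverse=True)
    return [s["id"] for s in ranked[:top_k]]


if __name__ == "__main__":
    print(retrieve_by_category(SCENARIOS, "pedestrian_crossing"))
    print(retrieve_by_embedding(SCENARIOS, "dog runs across the road"))
```

Keeping the two paths as separate functions mirrors the split between known and unknown scenarios described above. How the paper combines or prioritizes the two paths is not stated in this summary.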

🔎 Similar Papers

Racing Thoughts: Explaining Large Language Model Contextualization Errors