🤖 AI Summary
Problem: Large language models (LLMs) can hallucinate and do not model vehicle dynamics explicitly, so the decisions they produce for knowledge-driven autonomous driving cannot be guaranteed to be safe or compliant with traffic rules.
Method: This paper proposes SanDRA, which the authors present as the first LLM-based decision-making framework for automated vehicles that uses reachability analysis. A comprehensive description of the driving scenario prompts the LLM to generate and rank feasible driving actions. These actions are translated into temporal logic formulas that incorporate formalized traffic rules, and the formulas are then integrated into reachability analysis, which eliminates unsafe actions.
Contribution/Results: The framework couples LLM-based knowledge reasoning with formal verification. Evaluated in both open-loop and closed-loop driving environments with off-the-shelf and fine-tuned LLMs, it provides provably safe and, where possible, legally compliant driving actions, even in high-density traffic. To support reproducibility and further research, the code and experimental setups are publicly available at github.com/CommonRoad/SanDRA.
📝 Abstract
Large language models have been widely applied to knowledge-driven decision-making for automated vehicles due to their strong generalization and reasoning capabilities. However, the safety of the resulting decisions cannot be ensured due to possible hallucinations and the lack of integrated vehicle dynamics. To address this issue, we propose SanDRA, the first safe large-language-model-based decision-making framework for automated vehicles using reachability analysis. Our approach starts with a comprehensive description of the driving scenario to prompt large language models to generate and rank feasible driving actions. These actions are translated into temporal logic formulas that incorporate formalized traffic rules, and are subsequently integrated into reachability analysis to eliminate unsafe actions. We validate our approach in both open-loop and closed-loop driving environments using off-the-shelf and fine-tuned large language models, showing that it can provide provably safe and, where possible, legally compliant driving actions, even under high-density traffic conditions. To ensure transparency and facilitate future research, all code and experimental setups are publicly available at github.com/CommonRoad/SanDRA.
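The pipeline described above (ranked actions, a formal constraint, and a reachability check that discards unsafe actions) can be illustrated with a toy example. The following is a minimal sketch under strong assumptions, not the SanDRA implementation or its API: it uses a one-dimensional lane, a lead vehicle moving at constant speed, and a single constraint of the form "always keep at least `min_gap` behind the lead vehicle" standing in for a temporal logic formula. The action names, acceleration bounds, `Scenario` fields, and function names are all hypothetical. The interval propagation over-approximates the reachable set, so an empty set rules an action out, while a non-empty set is only a necessary condition for feasibility.

```python
from dataclasses import dataclass

# Hypothetical action labels mapped to acceleration bounds in m/s^2.
# These values are illustrative assumptions, not taken from the paper.
ACTION_ACCEL = {
    "accelerate": (0.5, 2.0),
    "keep": (-0.5, 0.5),
    "decelerate": (-4.0, -0.5),
}


@dataclass
class Scenario:
    ego_s: float           # ego position along the lane [m]
    ego_v: float           # ego speed [m/s]
    lead_s: float          # rear position of the lead vehicle [m]
    lead_v: float          # lead vehicle speed, assumed constant [m/s]
    v_max: float = 30.0    # speed limit [m/s]
    min_gap: float = 5.0   # required gap to the lead vehicle [m]


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def reachable_is_nonempty(sc, action, horizon=3.0, dt=0.5):
    """Propagate position and speed intervals; trim positions that violate the gap."""
    a_lo, a_hi = ACTION_ACCEL[action]
    s_lo = s_hi = sc.ego_s
    v_lo = v_hi = sc.ego_v
    steps = int(round(horizon / dt))
    for k in range(1, steps + 1):
        v_lo_new = clamp(v_lo + a_lo * dt, 0.0, sc.v_max)
        v_hi_new = clamp(v_hi + a_hi * dt, 0.0, sc.v_max)
        # Use the extreme speed within the step so the position interval
        # over-approximates the positions that are actually reachable.
        s_lo += min(v_lo, v_lo_new) * dt
        s_hi += max(v_hi, v_hi_new) * dt
        v_lo, v_hi = v_lo_new, v_hi_new
        # Constraint at step k: stay at least min_gap behind the lead vehicle.
        limit = sc.lead_s + sc.lead_v * (k * dt) - sc.min_gap
        s_hi = min(s_hi, limit)
        if s_lo > s_hi:
            return False  # no position satisfies the constraint at this step
    return True


def first_verified_action(sc, ranked_actions):
    """Return the highest-ranked action whose reachable set stays non-empty."""
    for action in ranked_actions:
        if reachable_is_nonempty(sc, action):
            return action
    return None  # no ranked action passed the check


# Stand-in for an LLM ranking: it prefers accelerating, then keeping speed.
ranked = ["accelerate", "keep", "decelerate"]
close_lead = Scenario(ego_s=0.0, ego_v=15.0, lead_s=20.0, lead_v=5.0)
print(first_verified_action(close_lead, ranked))
```

In this toy scenario the two higher-ranked actions are discarded because their reachable sets become empty within the horizon, so the check falls through to a lower-ranked action. The ranking is respected only as far as the reachability check allows.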