Reason--Imagine--Act: Closed-Loop LLM Decision Making with World Models for Autonomous Driving

📅 2026-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Semantics-driven driving policies often fail to reconcile intent understanding with physical feasibility in dynamic traffic, which can lead to unsafe behavior. This work proposes RIA (Reason--Imagine--Act), a closed-loop decision framework that couples a large language model with an action-conditioned world model at decision time. At each step, the LLM proposes an action template and candidate sub-actions, the world model evaluates them with short-horizon rollouts, and a safety scorer selects the safest executable action and feeds the outcome back to the next reasoning step. Under a unified CARLA point-goal protocol of 1000 episodes, RIA achieves 80.05% route completion, a 51.10% arrival rate, and a 0.20% collision rate, outperforming the training-free baselines CARLA TM and MADA on core closed-loop metrics.
📝 Abstract
Large language models (LLMs) are promising for autonomous driving, but semantics-only decision policies can yield physically unsafe behavior in dynamic traffic. Existing methods either perform online language reasoning without explicit dynamics verification or use world models mainly in offline pipelines, leaving a gap between semantic intent and physical feasibility at decision time. We propose Reason--Imagine--Act (RIA), a closed-loop framework that couples an LLM reasoner with an action-conditioned world model for online safety verification. At each step, the LLM proposes an action template and candidate sub-actions, the world model performs short-horizon rollouts, and a safety scorer selects the safest executable action with feedback to the next reasoning step. Under a unified CARLA point-goal protocol (1000 episodes), RIA achieves 80.05% route completion, 51.10% arrival rate, and 0.20% collision rate. Under the same closed-loop interface, RIA consistently outperforms training-free baselines, including CARLA TM and MADA, on core closed-loop metrics. For reproducibility, code is available at https://github.com/pku-smart-city/source_code/tree/main/RIA.
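The per-step loop described in the abstract (propose, roll out, score, select, feed back) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_reasoner` stands in for the LLM, `toy_world_model` is a one-dimensional kinematic placeholder for the learned action-conditioned world model, and the gap-based `safety_score`, the horizon, the time step, and all thresholds are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    gap_m: float      # distance to the obstacle ahead (hypothetical state)
    speed_mps: float  # ego speed


def toy_reasoner(state, feedback):
    """Stand-in for the LLM: returns (action template, candidate accelerations)."""
    if feedback == "unsafe" or state.gap_m < 20.0:
        return "slow_down", [-4.0, -2.0, 0.0]
    return "keep_lane", [-2.0, 0.0, 1.0]


def toy_world_model(state, accel, dt=0.5):
    """Stand-in for the world model: one action-conditioned prediction step."""
    speed = max(0.0, state.speed_mps + accel * dt)
    return State(state.gap_m - speed * dt, speed)


def rollout(state, accel, horizon=4):
    """Short-horizon rollout: apply the same candidate action for `horizon` steps."""
    trajectory = []
    for _ in range(horizon):
        state = toy_world_model(state, accel)
        trajectory.append(state)
    return trajectory


def safety_score(trajectory, min_gap_m=5.0):
    """Assumed score: worst-case gap minus a required margin (negative = violation)."""
    return min(s.gap_m for s in trajectory) - min_gap_m


def ria_step(state, feedback):
    """One Reason -> Imagine -> Act step; returns (template, action, next feedback)."""
    template, candidates = toy_reasoner(state, feedback)                    # Reason
    scored = [(safety_score(rollout(state, a)), a) for a in candidates]     # Imagine
    best_score, best_accel = max(scored)                                    # Act: safest candidate
    next_feedback = "ok" if best_score >= 0.0 else "unsafe"
    return template, best_accel, next_feedback


if __name__ == "__main__":
    print(ria_step(State(gap_m=12.0, speed_mps=8.0), "ok"))
```

In this toy setting the feedback string is the only signal returned to the reasoner; a real system would presumably pass richer rollout information to the next reasoning step, but the source does not specify its form.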
Problem

Research questions and friction points this paper is trying to address.

autonomous driving
large language models
world models
closed-loop decision making
physical safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

closed-loop decision making
world models
large language models
autonomous driving
safety verification
Zhengqi Sun
Department of Information Management, Peking University, Beijing 100871, China
Yiwen Sun
Institute for AI, Peking University
Intelligent Transportation System, Spatiotemporal data mining, Sequence learning
Boxuan Liu
Yuanpei College, Peking University, Beijing 100871, China
Tailai Chen
China Agricultural University, Beijing, China
Tianxu Guo
CRSC Research & Design Institute Group Co., Ltd., Beijing, China
Jiabin Liu
CRSC Research & Design Institute Group Co., Ltd., Beijing, China