🤖 AI Summary
This work addresses the shortage of controllable, realistic safety-critical scenarios for autonomous driving evaluation; such scenarios are rare and hard to collect in the real world. The authors propose AnchorDrive, a two-stage generation framework. First, a large language model acting as a driver agent synthesizes controllable driving behaviors in closed-loop simulation under natural language constraints, with a plan assessor reviewing its commands and returning corrective feedback. Then, anchor points extracted from the resulting trajectories guide a diffusion model to regenerate complete trajectories that preserve user intent while improving realism. By combining the semantic controllability of large language models with the distributional fidelity of diffusion models, the approach achieves superior overall performance over existing methods in criticality, realism, and controllability on the highD dataset.
📝 Abstract
Autonomous driving systems require comprehensive evaluation in safety-critical scenarios to ensure safety and robustness. However, such scenarios are rare and difficult to collect from real-world driving data, necessitating simulation-based synthesis, and existing synthesis methods often fall short in both controllability and realism. From a capability perspective, LLMs excel at controllable generation guided by natural language instructions, while diffusion models are better suited for producing trajectories consistent with realistic driving distributions. Leveraging their complementary strengths, we propose AnchorDrive, a two-stage safety-critical scenario generation framework. In the first stage, we deploy an LLM as a driver agent within a closed-loop simulation, where it reasons and iteratively outputs control commands under natural language constraints; a plan assessor reviews these commands and provides corrective feedback, enabling semantically controllable scenario generation. In the second stage, the LLM extracts key anchor points from the first-stage trajectories as guidance objectives, which, jointly with other guidance terms, steer the diffusion model to regenerate complete trajectories with improved realism while preserving user-specified intent. Experiments on the highD dataset demonstrate that AnchorDrive achieves superior overall performance in criticality, realism, and controllability, validating its effectiveness for generating controllable and realistic safety-critical scenarios.
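To make the second stage more concrete, the following is a minimal sketch of how anchor points could act as a guidance objective. It is not the authors' implementation: the abstract does not specify the guidance terms, so the squared-distance cost, the names `anchor_cost` and `guide_step`, the step size, and the toy trajectory are all assumptions made for illustration. In a real diffusion sampler, a gradient of this kind would presumably be applied to the model's trajectory estimate at each denoising step, alongside the other guidance terms.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def anchor_cost(traj: List[Point], anchors: Dict[int, Point]) -> float:
    """Sum of squared distances between the trajectory and its anchors.

    `anchors` maps a timestep index to the (x, y) position the vehicle
    should reach at that step. This cost form is an assumption.
    """
    return sum(
        (traj[t][0] - ax) ** 2 + (traj[t][1] - ay) ** 2
        for t, (ax, ay) in anchors.items()
    )


def guide_step(
    traj: List[Point], anchors: Dict[int, Point], step: float = 0.25
) -> List[Point]:
    """One gradient-descent step on anchor_cost; only anchored timesteps move."""
    guided = list(traj)
    for t, (ax, ay) in anchors.items():
        x, y = traj[t]
        # Gradient of the squared distance w.r.t. (x, y) is 2 * (point - anchor).
        guided[t] = (x - step * 2.0 * (x - ax), y - step * 2.0 * (y - ay))
    return guided


# Hypothetical 5-step trajectory (x, y in metres) driving straight in one lane.
trajectory = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0)]
# Hypothetical anchor: by step 3 the vehicle should have cut in to y = 3.5.
anchors = {3: (30.0, 3.5)}

before = anchor_cost(trajectory, anchors)
guided = guide_step(trajectory, anchors)
after = anchor_cost(guided, anchors)
```

Here the guided trajectory moves only at the anchored timestep and the anchor cost drops after one step. The sketch leaves the remaining timesteps untouched; in the framework described above, keeping the full trajectory realistic is the job of the diffusion model itself rather than of the anchor term.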