LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Evaluating autonomous driving systems in rare, high-risk scenarios is hampered by poor controllability, low interpretability, and heavy reliance on expert knowledge. To address these challenges, this paper proposes a natural-language-driven framework for generating adversarial driving scenarios. It integrates large language models (LLMs) with latent diffusion models (LDMs) into an end-to-end “instruction-to-adversarial-trajectory” pipeline. An LLM-based guidance module translates user instructions into interpretable adversarial loss functions, supported by a chain-of-thought (CoT) code generator and a code debugger, which enables semantic alignment with the instruction and fine-grained behavioral control. On the nuScenes dataset, the framework achieves state-of-the-art performance in generating realistic, diverse, and effective adversarial scenarios, supporting more targeted and interpretable stress testing of autonomous driving systems.

📝 Abstract
Ensuring the safety and robustness of autonomous driving systems necessitates a comprehensive evaluation in safety-critical scenarios. However, these safety-critical scenarios are rare and difficult to collect from real-world driving data, posing significant challenges to effectively assessing the performance of autonomous vehicles. Typical existing methods often suffer from limited controllability and lack user-friendliness, as extensive expert knowledge is essentially required. To address these challenges, we propose LD-Scene, a novel framework that integrates Large Language Models (LLMs) with Latent Diffusion Models (LDMs) for user-controllable adversarial scenario generation through natural language. Our approach comprises an LDM that captures realistic driving trajectory distributions and an LLM-based guidance module that translates user queries into adversarial loss functions, facilitating the generation of scenarios aligned with user queries. The guidance module integrates an LLM-based Chain-of-Thought (CoT) code generator and an LLM-based code debugger, enhancing the controllability and robustness in generating guidance functions. Extensive experiments conducted on the nuScenes dataset demonstrate that LD-Scene achieves state-of-the-art performance in generating realistic, diverse, and effective adversarial scenarios. Furthermore, our framework provides fine-grained control over adversarial behaviors, thereby facilitating more effective testing tailored to specific driving scenarios.
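
To make the idea of an adversarial loss used as guidance more concrete, the following is a minimal sketch and not the authors' implementation. It assumes trajectories are lists of 2D waypoints, and it assumes a simple loss (the minimum adversary-to-ego distance over matching timesteps) of the kind a guidance module might produce for a query such as "make the adversary approach the ego vehicle". The names `adversarial_loss` and `guidance_step` are hypothetical. In a guided diffusion sampler, a loss gradient like this would typically nudge samples during denoising; here the step is applied directly to a toy trajectory.

```python
import math

# Hypothetical illustration: the loss form and all names are assumptions,
# not taken from the LD-Scene paper.
Trajectory = list[tuple[float, float]]


def adversarial_loss(adv: Trajectory, ego: Trajectory) -> float:
    """Minimum distance between adversary and ego over matching timesteps.

    Lower values mean a closer, more safety-critical interaction.
    """
    return min(math.dist(a, e) for a, e in zip(adv, ego))


def guidance_step(adv: Trajectory, ego: Trajectory, step_size: float = 0.5) -> Trajectory:
    """Take one descent step on adversarial_loss with respect to adv.

    Only the waypoint at the closest timestep has a nonzero gradient,
    so only that waypoint is moved toward the ego vehicle.
    """
    t = min(range(len(adv)), key=lambda i: math.dist(adv[i], ego[i]))
    d = math.dist(adv[t], ego[t])
    if d == 0.0:
        return list(adv)  # already colliding; nothing to do
    (ax, ay), (ex, ey) = adv[t], ego[t]
    # The gradient of the distance w.r.t. the adversary waypoint is the
    # unit vector (a - e) / d. Step against it, capped to avoid overshoot.
    move = min(step_size, d)
    new_point = (ax - move * (ax - ex) / d, ay - move * (ay - ey) / d)
    return adv[:t] + [new_point] + adv[t + 1:]


ego = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
adv = [(0.0, 3.0), (1.0, 2.0), (2.0, 4.0)]
nudged = guidance_step(adv, ego)
print(adversarial_loss(adv, ego), adversarial_loss(nudged, ego))
```

Expressing the user's intent as an explicit loss function is what makes the guidance inspectable: the function can be read, tested, and adjusted independently of the generative model.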
Problem

Research questions and friction points this paper is trying to address.

Generating rare safety-critical driving scenarios for autonomous vehicles
Overcoming limited controllability in existing adversarial scenario methods
Enabling user-friendly natural language control for scenario generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided diffusion for adversarial scenario generation
Translation of natural-language user queries into adversarial loss functions
Chain-of-Thought code generator for controllability
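
The interplay between a code generator and a code debugger can be sketched as a generate, run, and repair loop. This is a hedged illustration only: the paper's prompts, models, and interfaces are not given here, so `build_guidance_fn`, `fake_generator`, and `fake_debugger` are hypothetical stand-ins, with plain Python callables in place of LLM calls. The expected function name `guidance_loss` is likewise an assumption.

```python
from typing import Callable

# Hypothetical sketch: the callables stand in for LLM calls, and the
# function name `guidance_loss` is an assumed convention.
LossFn = Callable[[list, list], float]


def build_guidance_fn(
    query: str,
    generate_code: Callable[[str], str],
    debug_code: Callable[[str, str], str],
    max_repairs: int = 3,
) -> LossFn:
    """Generate loss-function code for a query and repair it until it runs."""
    code = generate_code(query)
    for attempt in range(max_repairs + 1):
        namespace: dict = {}
        try:
            # NOTE: exec on untrusted generated code is unsafe; a real
            # system would need sandboxing. Used here for brevity only.
            exec(code, namespace)
            fn = namespace["guidance_loss"]
            fn([(0.0, 1.0)], [(0.0, 0.0)])  # smoke test on dummy input
            return fn
        except Exception as err:
            if attempt == max_repairs:
                raise RuntimeError("could not repair generated code") from err
            code = debug_code(code, repr(err))
    raise AssertionError("unreachable")


def fake_generator(query: str) -> str:
    # Deliberately buggy output: `math` is never imported.
    return (
        "def guidance_loss(adv, ego):\n"
        "    return min(math.dist(a, e) for a, e in zip(adv, ego))\n"
    )


def fake_debugger(code: str, error: str) -> str:
    # Stand-in for an LLM repair step that reads the error message.
    if "math" in error:
        return "import math\n" + code
    return code


loss_fn = build_guidance_fn("make the adversary cut in", fake_generator, fake_debugger)
print(loss_fn([(3.0, 4.0)], [(0.0, 0.0)]))
```

Feeding the runtime error back to the repair step is the design choice that matters here: generation failures become recoverable instead of aborting scenario generation.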

Mingxing Peng
HKUST-GZ
large language model, trajectory generation, traffic simulation
Yuting Xie
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China
Xusen Guo
Intelligent Transportation Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Ruoyu Yao
Robotics and Autonomous Systems Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Hai Yang
Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
Jun Ma
Robotics and Autonomous Systems Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China, and also with the Division of Emerging Interdisciplinary Areas, The Hong Kong University of Science and Technology, Hong Kong SAR, China