Learning from Risk: LLM-Guided Generation of Safety-Critical Scenarios with Prior Knowledge

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Autonomous driving safety validation faces dual challenges: data scarcity for rare, long-tail events and insufficient controllability in complex multi-agent interactions. To address these, we propose a high-fidelity traffic scenario generation framework integrating Conditional Variational Autoencoders (CVAEs) with Large Language Models (LLMs). The CVAE encodes historical trajectories and map priors, while the LLM serves as an adversarial reasoning engine that dynamically translates unstructured semantic descriptions into risk-oriented, domain-specific loss functions, enabling physically consistent and risk-controllable scenario synthesis. Evaluated in CARLA and SMARTS, our method significantly improves coverage of high-risk and long-tail scenarios, enhances fidelity between simulated and real-world traffic distributions, and generates more challenging interactive scenarios, thereby effectively supporting stress testing of autonomous driving systems.

📝 Abstract
Autonomous driving faces critical challenges in rare long-tail events and complex multi-agent interactions, which are scarce in real-world data yet essential for robust safety validation. This paper presents a high-fidelity scenario generation framework that integrates a conditional variational autoencoder (CVAE) with a large language model (LLM). The CVAE encodes historical trajectories and map information from large-scale naturalistic datasets to learn latent traffic structures, enabling the generation of physically consistent base scenarios. Building on this, the LLM acts as an adversarial reasoning engine, parsing unstructured scene descriptions into domain-specific loss functions and dynamically guiding scenario generation across varying risk levels. This knowledge-driven optimization balances realism with controllability, ensuring that generated scenarios remain both plausible and risk-sensitive. Extensive experiments in CARLA and SMARTS demonstrate that our framework substantially increases the coverage of high-risk and long-tail events, improves consistency between simulated and real-world traffic distributions, and exposes autonomous driving systems to interactions that are significantly more challenging than those produced by existing rule- or data-driven methods. These results establish a new pathway for safety validation, enabling principled stress-testing of autonomous systems under rare but consequential events.
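The CVAE component described in the abstract, which conditions on historical trajectories and map information to decode physically consistent future trajectories, can be sketched as a minimal forward pass. This is an illustrative NumPy sketch, not the authors' model: all dimensions, the 16-d map feature, and the randomly initialised weights (standing in for trained parameters) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): 10 past steps,
# 20 future steps, 2-D positions, and a 16-d map feature vector.
H, F, D, LATENT, MAP = 10, 20, 2, 8, 16
COND = H * D + MAP                      # flattened history + map prior

# Randomly initialised weights stand in for a trained CVAE.
W_enc = rng.normal(0, 0.1, (COND + F * D, 2 * LATENT))
W_dec = rng.normal(0, 0.1, (COND + LATENT, F * D))

def encode(cond, future):
    """Training-time posterior q(z | condition, future): mean, log-variance."""
    h = np.concatenate([cond, future.ravel()]) @ W_enc
    return h[:LATENT], h[LATENT:]

def decode(cond, z):
    """Decoder p(future | condition, z): a future trajectory of shape (F, D)."""
    return (np.concatenate([cond, z]) @ W_dec).reshape(F, D)

def sample_scenario(history, map_feat):
    """Generation: condition on history + map, sample z from the unit prior."""
    cond = np.concatenate([history.ravel(), map_feat])
    z = rng.normal(size=LATENT)         # z ~ N(0, I) at generation time
    return decode(cond, z)

history = rng.normal(size=(H, D))       # toy past trajectory
map_feat = rng.normal(size=MAP)         # toy map embedding
future = sample_scenario(history, map_feat)
print(future.shape)                     # (20, 2)
```

At generation time only the decoder and the unit-Gaussian prior are used; the posterior `encode` is needed during training, where the KL term keeps sampled scenarios close to the learned traffic distribution.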
Problem

Research questions and friction points this paper is trying to address.

Generating safety-critical scenarios for autonomous driving validation
Addressing rare long-tail events and complex multi-agent interactions
Balancing scenario realism with controllable risk levels
Innovation

Methods, ideas, or system contributions that make the work stand out.

CVAE generates physically consistent base scenarios
LLM acts as adversarial reasoning engine
Knowledge-driven optimization balances realism with controllability
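The second and third points above (an LLM translating a semantic risk description into a domain-specific loss that balances realism with controllability) can be illustrated with a toy example. The loss below is a hypothetical output for a "close cut-in" description, not one taken from the paper; the gap and smoothness terms and the weight `w_smooth` are assumptions.

```python
import numpy as np

# Hypothetical loss an LLM might emit for the description
# "the adversary cuts in closely in front of the ego vehicle":
# shrink the minimum ego-adversary gap (risk term) while penalising
# jerky motion (a stand-in for the paper's physical-consistency prior).
def risk_loss(ego, adv, w_smooth=0.5):
    gap = np.min(np.linalg.norm(ego - adv, axis=1))   # closest approach (m)
    jerk = np.sum(np.diff(adv, n=2, axis=0) ** 2)     # smoothness penalty
    return gap + w_smooth * jerk

t = np.linspace(0.0, 1.0, 20)[:, None]
ego = np.hstack([30.0 * t, np.zeros((20, 1))])        # ego drives straight
adv = np.hstack([30.0 * t, np.full((20, 1), 3.5)])    # adversary one lane over

before = risk_loss(ego, adv)
adv_risky = adv + 0.5 * (ego - adv)   # pull the adversary toward the ego lane
after = risk_loss(ego, adv_risky)
print(before, after)                  # the loss drops as the gap shrinks
```

In the framework described above, an optimiser would descend such a loss over the adversary's trajectory parameters to dial risk up or down; the single uniform pull here only illustrates the direction of that optimisation.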