Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis

📅 2025-06-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Autonomous driving faces challenges in generating complex, rare scenarios and achieving high fidelity for safety-critical cases. This paper systematically surveys foundation models for scenario generation and analysis, covering large language models (LLMs), vision-language models (VLMs), multimodal LLMs, diffusion models, and world models. It proposes, for the first time, a unified taxonomy tailored to autonomous driving. The structured framework encompasses methods, datasets, simulation platforms, and evaluation metrics, and it introduces novel domain-specific metrics and two analytical dimensions: causal fidelity and safety-critical fidelity. The authors publicly release an actively maintained literature repository and supplementary materials. Synthesizing over 100 state-of-the-art works, the survey catalogs major open-source resources and identifies key bottlenecks (e.g., insufficient causal modeling, distortion of safety-critical scenarios) alongside promising future directions. The work provides a systematic foundation for enhancing both diversity and realism in autonomous driving scenario generation.

📝 Abstract

For autonomous vehicles, safe navigation in complex environments depends on handling a broad range of diverse and rare driving scenarios. Simulation- and scenario-based testing have emerged as key approaches to the development and validation of autonomous driving systems. Traditional scenario generation relies on rule-based systems, knowledge-driven models, and data-driven synthesis, often producing limited diversity and unrealistic safety-critical cases. With the emergence of foundation models, which represent a new generation of pre-trained, general-purpose AI models, developers can process heterogeneous inputs (e.g., natural language, sensor data, HD maps, and control actions), enabling the synthesis and interpretation of complex driving scenarios. In this paper, we survey the application of foundation models to scenario generation and scenario analysis in autonomous driving (as of May 2025). Our survey presents a unified taxonomy that includes large language models, vision-language models, multimodal large language models, diffusion models, and world models for the generation and analysis of autonomous driving scenarios. In addition, we review the methodologies, open-source datasets, simulation platforms, and benchmark challenges, and we examine the evaluation metrics tailored explicitly to scenario generation and analysis. The survey concludes by highlighting open challenges and research questions and by outlining promising future research directions. All reviewed papers are listed in a continuously maintained repository, which contains supplementary materials and is available at https://github.com/TUM-AVS/FM-for-Scenario-Generation-Analysis.

Problem

Research questions and friction points this paper is trying to address.

Enhancing diversity in autonomous driving scenario generation

Improving realism of safety-critical driving cases

Integrating foundation models for scenario analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Foundation models enable diverse scenario generation

Multimodal inputs processed for complex scenarios

Unified taxonomy for autonomous driving scenarios
