🤖 AI Summary
This work addresses two key challenges in end-to-end (E2E) autonomous driving planner evaluation: scarcity of real-world video data and poor out-of-distribution (OOD) generalization. We propose a novel co-evaluation paradigm integrating video generation models with E2E driving models. Methodologically, we construct a controllable virtual testbed and—uniquely—employ the E2E planner as a discriminator to enable closed-loop, policy-feedback-driven quantification of video realism. Our framework incorporates controlled experimental design, statistical significance testing, and interactive realism metrics to support targeted attribution analysis of distributional shifts. Experiments demonstrate high-fidelity, controllable video synthesis under complex conditions (e.g., weather, viewpoint, traffic density) and show that synthetic data substantially improves OOD generalization in novel operational domains, reducing reliance on real-world data collection. The core contribution is the first driving-centric, closed-loop video generation evaluation framework, advancing trustworthy, interpretable, and cost-efficient autonomous system validation.
📝 Abstract
Recent advances in generative models have sparked exciting new possibilities in the field of autonomous vehicles. Specifically, video generation models are now being explored as controllable virtual testing environments. Simultaneously, end-to-end (E2E) driving models have emerged as a streamlined alternative to conventional modular autonomous driving systems, gaining popularity for their simplicity and scalability. However, the application of these techniques to simulation and planning raises important questions. First, while video generation models can produce increasingly realistic videos, can these videos faithfully adhere to the specified conditions and be realistic enough for E2E autonomous planner evaluation? Second, given that data is crucial for understanding and controlling E2E planners, how can we gain deeper insights into their biases and improve their ability to generalize to out-of-distribution scenarios? In this work, we bridge the gap between driving models and generative world models (Drive&Gen) to address these questions. We propose novel statistical measures leveraging E2E drivers to evaluate the realism of generated videos. By exploiting the controllability of the video generation model, we conduct targeted experiments to investigate distribution gaps affecting E2E planner performance. Finally, we show that synthetic data produced by the video generation model offers a cost-effective alternative to real-world data collection. This synthetic data effectively improves E2E model generalization beyond existing Operational Design Domains, facilitating the expansion of autonomous vehicle services into new operational contexts.
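The planner-as-discriminator idea above can be illustrated with a minimal sketch: run the same E2E planner on paired real and generated clips and measure how far its planned trajectories shift. All names here (`plan`, `realism_gap`) are illustrative stand-ins, not the paper's actual implementation, and the toy planner is a deterministic stub; the point is only the shape of the statistic, where a small gap means the synthetic video is "real enough" from the planner's perspective.

```python
import numpy as np

def plan(video: np.ndarray) -> np.ndarray:
    """Toy stand-in for an E2E planner: maps a video clip to a planned
    trajectory of 10 (x, y) waypoints. Heading depends smoothly on clip
    content, so similar clips yield similar trajectories."""
    theta = float(video.mean()) * np.pi
    step = np.array([np.cos(theta), np.sin(theta)])
    return np.cumsum(np.tile(step, (10, 1)), axis=0)  # shape (10, 2)

def realism_gap(real_videos, gen_videos) -> float:
    """One plausible planner-based realism statistic (not necessarily the
    paper's exact measure): mean per-waypoint displacement between the
    planner's outputs on paired real vs. generated clips."""
    gaps = [np.linalg.norm(plan(r) - plan(g), axis=1).mean()
            for r, g in zip(real_videos, gen_videos)]
    return float(np.mean(gaps))

# Synthetic stand-ins for video clips (frames x height x width).
rng = np.random.default_rng(0)
real = [rng.random((4, 8, 8)) for _ in range(5)]
fake = [v + rng.normal(0.0, 0.01, v.shape) for v in real]  # near-real copies

print(f"planner gap (real vs. near-real): {realism_gap(real, fake):.4f}")
```

In the full framework this scalar would feed a statistical significance test over many clips; here it simply shows the closed-loop direction of evaluation, with the planner, rather than a pixel metric, scoring realism.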