Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes

📅 2025-05-30
🤖 AI Summary
To address the challenge of generating realistic, controllable crash videos for traffic safety simulation, which is hindered by the scarcity of real-world accident data, this paper proposes a controllable video generation framework based on diffusion models. The method introduces three key contributions: (1) a multi-signal, decoupled classifier-free guidance mechanism that enables independent, fine-grained control over collision types, object positions, and other attributes; (2) a multi-condition joint embedding that integrates bounding boxes, collision semantics, and initial frames, augmented with physics-aware consistency modeling to improve dynamic plausibility; and (3) support for counterfactual scenario generation, where minor input perturbations induce substantially divergent collision trajectories. The paper reports state-of-the-art performance on quantitative metrics including FVD and JEDi, and a human evaluation rates its physical realism and visual quality above those of existing diffusion-based video generation approaches.

📝 Abstract
Video diffusion techniques have advanced significantly in recent years; however, they struggle to generate realistic imagery of car crashes due to the scarcity of accident events in most driving datasets. Improving traffic safety requires realistic and controllable accident simulations. To tackle this problem, we propose Ctrl-Crash, a controllable car crash video generation model that conditions on signals such as bounding boxes, crash types, and an initial image frame. Our approach enables counterfactual scenario generation, where minor variations in input can lead to dramatically different crash outcomes. To support fine-grained control at inference time, we leverage classifier-free guidance with independently tunable scales for each conditioning signal. Compared to prior diffusion-based methods, Ctrl-Crash achieves state-of-the-art performance on quantitative video quality metrics (e.g., FVD and JEDi) and on qualitative measurements from a human evaluation of physical realism and video quality.
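The idea of classifier-free guidance with a separate scale per conditioning signal can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes one common additive formulation, in which each condition contributes its own scaled difference from the unconditional noise prediction. The function name `multi_condition_cfg`, the condition names, and the toy arrays are hypothetical and stand in for real denoiser outputs.

```python
import numpy as np


def multi_condition_cfg(eps_uncond, eps_cond, scales):
    """Combine noise predictions using one guidance scale per condition.

    Assumed formulation (illustrative, not taken from the paper):
        eps = eps_uncond + sum_i scale_i * (eps_cond_i - eps_uncond)

    eps_uncond: prediction with every conditioning signal dropped.
    eps_cond:   dict mapping condition name -> prediction with only
                that condition active.
    scales:     dict mapping condition name -> guidance scale; a
                missing entry is treated as scale 0 (condition ignored).
    """
    base = np.asarray(eps_uncond, dtype=float)
    guided = base.copy()
    for name, eps in eps_cond.items():
        scale = scales.get(name, 0.0)
        guided += scale * (np.asarray(eps, dtype=float) - base)
    return guided


# Toy stand-ins for denoiser outputs on a 2x2 latent (hypothetical values).
eps_uncond = np.zeros((2, 2))
eps_cond = {
    "bbox": np.ones((2, 2)),              # bounding-box conditioning
    "crash_type": 2.0 * np.ones((2, 2)),  # crash-type conditioning
}

# Each signal is weighted independently at inference time.
guided = multi_condition_cfg(
    eps_uncond, eps_cond, {"bbox": 1.0, "crash_type": 0.5}
)
```

In this sketch, setting a scale to 0 removes that signal's influence entirely, while raising it pushes the sample more strongly toward that condition without touching the others, which is the kind of independent control the abstract describes.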
Problem

Research questions and friction points this paper is trying to address.

Generating realistic car crash videos with diffusion models
Enabling controllable accident simulations for traffic safety
Improving fine-grained control in crash scenario generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Controllable car crash video generation model
Classifier-free guidance for fine-grained control
State-of-the-art performance in video quality metrics