🤖 AI Summary
Realistic, controllable crash videos are needed for traffic safety simulation, but real-world accident data is scarce. To address this, the paper proposes a controllable video generation framework based on diffusion models. The method introduces three key innovations: (1) a multi-signal decoupled classifier-free guidance mechanism that enables independent, fine-grained control over collision types, object positions, and other attributes; (2) a multi-condition joint embedding that integrates bounding boxes, collision semantics, and initial frames, augmented with physics-aware consistency modeling to improve dynamic plausibility; and (3) support for counterfactual scenario generation, where minor input perturbations induce substantially divergent collision trajectories. Experiments show state-of-the-art performance on quantitative metrics including FVD and JEDi, and human evaluation indicates superior physical realism and visual quality compared to existing diffusion-based video generation approaches.
📝 Abstract
Video diffusion techniques have advanced significantly in recent years; however, they struggle to generate realistic imagery of car crashes due to the scarcity of accident events in most driving datasets. Improving traffic safety requires realistic and controllable accident simulations. To tackle this problem, we propose Ctrl-Crash, a controllable car crash video generation model that conditions on signals such as bounding boxes, crash types, and an initial image frame. Our approach enables counterfactual scenario generation, where minor variations in input can lead to dramatically different crash outcomes. To support fine-grained control at inference time, we leverage classifier-free guidance with independently tunable scales for each conditioning signal. Compared to prior diffusion-based methods, Ctrl-Crash achieves state-of-the-art performance on quantitative video quality metrics (e.g., FVD and JEDi) and on qualitative measurements based on a human evaluation of physical realism and video quality.
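The idea of classifier-free guidance with a separate scale per conditioning signal can be illustrated with a minimal sketch. This is not the Ctrl-Crash implementation: the abstract does not give the exact guidance formula, so the additive form below (one common way to combine several conditions) is an assumption, as are the function name `multi_condition_cfg`, the condition names `bbox` and `crash_type`, and the toy arrays standing in for a denoiser's noise predictions.

```python
import numpy as np


def multi_condition_cfg(eps_uncond, eps_per_condition, scales):
    """Combine noise predictions with one guidance scale per condition.

    Assumed formulation (not taken from the paper):
        eps = eps_uncond + sum_i scale_i * (eps_i - eps_uncond)
    where eps_i is the prediction with only condition i enabled.
    """
    base = np.asarray(eps_uncond, dtype=float)
    guided = base.copy()
    for name, eps_cond in eps_per_condition.items():
        # Each condition pushes the prediction along its own direction,
        # weighted by its own independently chosen scale.
        guided += scales[name] * (np.asarray(eps_cond, dtype=float) - base)
    return guided


# Toy stand-ins for denoiser outputs (hypothetical values).
eps_uncond = np.zeros(3)
eps_per_condition = {
    "bbox": np.array([1.0, 0.0, 0.0]),        # only bounding boxes enabled
    "crash_type": np.array([0.0, 2.0, 0.0]),  # only crash type enabled
}
scales = {"bbox": 2.0, "crash_type": 0.5}

guided = multi_condition_cfg(eps_uncond, eps_per_condition, scales)
print(guided)  # [2. 1. 0.]
```

In this sketch, setting a scale to 0 removes that signal's influence entirely, while raising it strengthens adherence to that signal without changing the others, which is the sense in which the scales are independently tunable.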