🤖 AI Summary
Existing autonomous driving simulators rely on rasterized scene encodings, which inflate parameter counts and waste computation on empty pixels, and on rule-based agent behaviors that lack realism and diversity. This work introduces Scenario Dreamer, a planning-oriented, fully data-driven generative simulator. Methodologically, it (1) proposes a novel vectorized latent diffusion model that jointly encodes lane graphs and agent bounding boxes, eliminating inefficient raster representations; (2) designs a data-driven autoregressive Transformer that synthesizes high-fidelity, diverse, closed-loop agent trajectories; and (3) enables unbounded scene extrapolation via diffusion-based inpainting. Experiments show that, compared to the strongest baseline, Scenario Dreamer achieves superior generation quality with roughly 2x fewer model parameters, 6x lower generation latency, and 10x fewer GPU training hours. Moreover, its environments pose a greater challenge to reinforcement learning planners than traditional non-generative simulators, particularly in long-horizon and adversarial scenarios.
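The efficiency argument in point (1) comes down to representation size: a raster image spends most of its capacity on empty pixels, while a vectorized scene stores only lane polylines and agent boxes. The toy comparison below illustrates the gap; all names, resolutions, and element counts are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical raster encoding: a 256x256 grid with 3 channels
# (e.g. drivable area, lane markings, agents) -- mostly empty pixels.
H = W = 256
raster = np.zeros((H, W, 3), dtype=np.float32)

# Hypothetical vectorized encoding of the same scene:
# lane graph as polylines of 2-D waypoints, agents as 5-D boxes.
num_lanes, pts_per_lane = 40, 20
lane_graph = np.random.rand(num_lanes, pts_per_lane, 2).astype(np.float32)  # (x, y) per waypoint
num_agents = 24
agent_boxes = np.random.rand(num_agents, 5).astype(np.float32)  # (x, y, length, width, heading)

raster_floats = raster.size
vector_floats = lane_graph.size + agent_boxes.size
print(raster_floats, vector_floats)  # 196608 vs 1720: >100x fewer values to encode
```

A latent diffusion model operating on the vector set only has to model these few thousand values rather than a dense image, which is consistent with the reported parameter and latency savings.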
📝 Abstract
We introduce Scenario Dreamer, a fully data-driven generative simulator for autonomous vehicle planning that generates both the initial traffic scene - comprising a lane graph and agent bounding boxes - and closed-loop agent behaviours. Existing methods for generating driving simulation environments encode the initial traffic scene as a rasterized image and, as such, require parameter-heavy networks that perform unnecessary computation due to many empty pixels in the rasterized scene. Moreover, we find that existing methods that employ rule-based agent behaviours lack diversity and realism. Scenario Dreamer instead employs a novel vectorized latent diffusion model for initial scene generation that operates directly on vectorized scene elements, and an autoregressive Transformer for data-driven agent behaviour simulation. Scenario Dreamer additionally supports scene extrapolation via diffusion inpainting, enabling the generation of unbounded simulation environments. Extensive experiments show that Scenario Dreamer outperforms existing generative simulators in realism and efficiency: the vectorized scene-generation base model achieves superior generation quality with around 2x fewer parameters, 6x lower generation latency, and 10x fewer GPU training hours compared to the strongest baseline. We confirm its practical utility by showing that reinforcement learning planning agents are more challenged in Scenario Dreamer environments than in traditional non-generative simulation environments, especially in long and adversarial driving environments.
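The scene-extrapolation idea follows the standard diffusion-inpainting recipe: at every reverse-diffusion step, the already-generated part of the scene is clamped back to its known values so the model only fills in the masked-out extension. The 1-D sketch below shows the conditioning mechanism only; the "denoiser" is a placeholder smoothing step standing in for the paper's trained network, and all variable names and sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Known half of a (flattened, toy) scene, e.g. lane geometry already generated.
known = np.linspace(0.0, 1.0, 32)
mask = np.zeros(64, dtype=bool)
mask[:32] = True  # True where the scene is already fixed

x = rng.normal(size=64)  # start the extension from pure noise
for t in range(50):      # toy reverse-diffusion loop
    # Placeholder "denoiser": neighbour averaging stands in for a trained score model.
    x = 0.5 * x + 0.25 * (np.roll(x, 1) + np.roll(x, -1))
    # Inpainting step: re-impose the observed values on the known region each iteration,
    # so the generated extension stays conditioned on the existing scene.
    x[mask] = known

# The known region is preserved exactly; only the unmasked half was generated.
```

Repeating this clamp-and-denoise step outward lets the simulator extend a scene indefinitely, which is what enables the unbounded environments described above.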