🤖 AI Summary
Existing view synthesis methods for autonomous driving interpolate well along recorded trajectories but generalize poorly to unseen free-form camera paths, which hinders closed-loop evaluation of end-to-end driving policies. To address this, we propose the first multi-trajectory joint optimization framework that uses a video diffusion model as a generative prior, combining inverse rendering with learned spatiotemporal priors to iteratively reconstruct a 3D scene representation. Our method uses differentiable Gaussian splatting for efficient, high-fidelity rendering and enables temporally consistent novel-view synthesis along arbitrary user-specified trajectories. Experiments demonstrate substantial improvements in geometric consistency and dynamic detail fidelity on unseen trajectories compared to prior approaches. The resulting framework targets photorealistic driving simulation and the construction of virtual driving worlds from AI-generated videos.
📝 Abstract
Driving scene synthesis along free-form trajectories is essential for driving simulation, as it enables closed-loop evaluation of end-to-end driving policies. While existing methods excel at novel view synthesis on recorded trajectories, they struggle on novel trajectories because driving videos cover a limited set of viewpoints and driving environments are vast. To tackle this challenge, we propose a free-form driving view synthesis approach, dubbed DriveX, which leverages a video generative prior to optimize a 3D model across a variety of trajectories. Concretely, we formulate an inverse problem that allows a video diffusion model to serve as a prior for many-trajectory optimization of a parametric 3D model (e.g., Gaussian splatting). To use the generative prior seamlessly, we repeat this process iteratively during optimization. The resulting model can produce high-fidelity virtual driving environments outside the recorded trajectory, enabling free-form trajectory driving simulation. Beyond real driving scenes, DriveX can also be used to simulate virtual driving worlds from AI-generated videos.
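The iterative scheme described in the abstract can be illustrated with a toy sketch. This is not the DriveX implementation: the "scene" is a plain list of numbers standing in for a parametric 3D model, `render` simply reads a window of cells, and `prior_refine` is a hypothetical smoothing filter standing in for a video diffusion model. All function names and parameters here are assumptions made for illustration. The loop alternates between (1) rendering novel trajectories and refining them with the prior to obtain pseudo-targets, and (2) fitting the scene jointly to the recorded views and those pseudo-targets.

```python
import random


def render(scene, start, width):
    """Toy renderer: a 'view' is a window of scene cells."""
    return scene[start:start + width]


def prior_refine(view):
    """Hypothetical stand-in for a video generative prior.

    It smooths a render with a 3-tap average; a real system would
    run a video diffusion model at this point.
    """
    n = len(view)
    out = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n, i + 2)
        out.append(sum(view[lo:hi]) / (hi - lo))
    return out


def optimize(recorded, novel_starts, n_cells, width,
             rounds=300, steps=50, lr=0.2, seed=0):
    """Alternate between prior-based pseudo-targets and scene fitting.

    recorded: dict mapping a window start index to its observed view.
    novel_starts: window start indices of unrecorded trajectories.
    """
    rng = random.Random(seed)
    scene = [rng.uniform(-1.0, 1.0) for _ in range(n_cells)]
    for _ in range(rounds):
        # (1) Render novel trajectories and refine them with the prior.
        pseudo = {s: prior_refine(render(scene, s, width))
                  for s in novel_starts}
        # (2) Fit the scene to recorded views and pseudo-targets jointly
        #     by gradient descent on a squared-error loss.
        targets = list(recorded.items()) + list(pseudo.items())
        for _ in range(steps):
            grad = [0.0] * n_cells
            for start, target in targets:
                for k, t in enumerate(target):
                    grad[start + k] += 2.0 * (scene[start + k] - t)
            scene = [x - lr * g / len(targets)
                     for x, g in zip(scene, grad)]
    return scene


if __name__ == "__main__":
    # One recorded view covers cells 0..7; one novel view covers 4..11.
    recorded_views = {0: [1.0] * 8}
    result = optimize(recorded_views, novel_starts=[4], n_cells=12, width=8)
    print([round(x, 3) for x in result])
```

In this toy setting, cells seen only from the novel trajectory start out random and are gradually pulled toward the recorded content, because each round's pseudo-targets are regenerated from the current scene. That regeneration step is the point of repeating the process during optimization rather than applying the prior once.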