Driving Scene Synthesis on Free-form Trajectories with Generative Prior

📅 2024-12-02

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Existing view synthesis methods for autonomous driving interpolate well along recorded trajectories but generalize poorly to unseen free-form camera paths, which hinders closed-loop evaluation of end-to-end driving policies. To address this, we propose DriveX, a multi-trajectory joint optimization framework that uses a video diffusion model as a generative prior, combining inverse rendering with learned spatiotemporal priors to iteratively reconstruct a 3D scene representation. Our method uses differentiable Gaussian splatting for efficient, high-fidelity rendering and enables temporally consistent novel-view synthesis along arbitrary user-specified trajectories. Experiments demonstrate substantial improvements in geometric consistency and dynamic detail fidelity on unseen trajectories compared to prior approaches. The resulting model supports photorealistic driving simulation and the construction of virtual driving worlds from AI-generated videos.

📝 Abstract

Driving scene synthesis along free-form trajectories is essential for driving simulations to enable closed-loop evaluation of end-to-end driving policies. While existing methods excel at novel view synthesis on recorded trajectories, they struggle with novel trajectories because driving videos offer limited views and driving environments are vast. To tackle this challenge, we propose a novel free-form driving view synthesis approach, dubbed DriveX, which leverages a video generative prior to optimize a 3D model across a variety of trajectories. Concretely, we craft an inverse problem that enables a video diffusion model to be used as a prior for many-trajectory optimization of a parametric 3D model (e.g., Gaussian splatting). To use the generative prior seamlessly, we conduct this process iteratively during optimization. The resulting model can produce high-fidelity virtual driving environments outside the recorded trajectory, enabling free-form trajectory driving simulation. Beyond real driving scenes, DriveX can also be used to simulate virtual driving worlds from AI-generated videos.

Problem

Research questions and friction points this paper is trying to address.

Synthesizing driving views on novel free-form trajectories

Overcoming limited viewpoints in existing driving video methods

Integrating generative prior with 3D Gaussian models effectively

Innovation

Methods, ideas, or system contributions that make the work stand out.

Distills generative prior into 3D Gaussian model

Uses video diffusion model for rendering refinement

Progressively updates pseudo ground truth for consistency
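
The loop described above (render along a novel trajectory, refine the rendering with a generative prior, treat the result as pseudo ground truth, and keep optimizing) can be sketched with a toy example. This is an illustrative sketch under strong assumptions, not the DriveX implementation: the scene is a plain list of numbers rather than a Gaussian splatting model, a 3-tap average stands in for the video diffusion model, and the names `render`, `smooth_prior`, and `optimize` are hypothetical.

```python
from typing import Callable, List

Image = List[float]  # toy "rendering": one value per visible pixel


def render(scene: List[float], view: List[int]) -> Image:
    """Toy renderer (assumption): a view reads the scene parameters it can see."""
    return [scene[i] for i in view]


def smooth_prior(image: Image) -> Image:
    """Hypothetical stand-in for a video diffusion prior: a 3-tap average."""
    n = len(image)
    return [
        (image[max(i - 1, 0)] + image[i] + image[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def optimize(
    scene: List[float],
    recorded_view: List[int],
    recorded_gt: Image,
    novel_view: List[int],
    prior: Callable[[Image], Image],
    steps: int = 200,
    refresh_every: int = 20,
    lr: float = 0.2,
    prior_weight: float = 0.5,
) -> List[float]:
    """Gradient descent on recorded views plus prior-refined novel views."""
    scene = list(scene)
    pseudo_gt: Image = []
    for step in range(steps):
        # Progressively refresh the pseudo ground truth from the current scene.
        if step % refresh_every == 0:
            pseudo_gt = prior(render(scene, novel_view))
        grad = [0.0] * len(scene)
        # Gradient of 0.5 * (scene[i] - target)^2 is (scene[i] - target).
        for i, target in zip(recorded_view, recorded_gt):
            grad[i] += scene[i] - target
        for i, target in zip(novel_view, pseudo_gt):
            grad[i] += prior_weight * (scene[i] - target)
        scene = [p - lr * g for p, g in zip(scene, grad)]
    return scene


if __name__ == "__main__":
    # Pixels 0-3 are observed on the recorded trajectory; 4-5 only on the novel one.
    fitted = optimize([0.0] * 6, [0, 1, 2, 3], [1.0] * 4, [2, 3, 4, 5], smooth_prior)
    print([round(p, 3) for p in fitted])
```

In this toy setup, pixels 4 and 5 are never observed on the recorded trajectory, so without the prior term they receive no gradient at all. With the prior term, each refresh of the pseudo ground truth lets information from the observed region spread into the unobserved one, which is the role the generative prior plays in the description above.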

🔎 Similar Papers

DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes