Driving Scene Synthesis on Free-form Trajectories with Generative Prior

📅 2024-12-02
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing view synthesis methods for autonomous driving achieve strong interpolation performance on recorded trajectories but exhibit poor generalization to unseen free-form camera paths, hindering closed-loop evaluation of end-to-end driving policies. To address this, we propose the first multi-trajectory joint optimization framework that integrates a video diffusion model as a generative prior, synergistically combining inverse rendering with learned spatiotemporal priors to iteratively reconstruct a 3D scene representation. Our method employs differentiable Gaussian splatting for efficient, high-fidelity rendering and enables temporally consistent novel-view synthesis along arbitrary user-specified trajectories. Experiments demonstrate substantial improvements in geometric consistency and dynamic detail fidelity on unseen trajectories compared to prior approaches. This work establishes a new paradigm for photorealistic driving simulation and AI-driven virtual world construction.

📝 Abstract
Driving scene synthesis along free-form trajectories is essential for driving simulation, enabling closed-loop evaluation of end-to-end driving policies. While existing methods excel at novel view synthesis on recorded trajectories, they struggle on novel trajectories due to the limited views of driving videos and the vastness of driving environments. To tackle this challenge, we propose a novel free-form driving view synthesis approach, dubbed DriveX, which leverages a video generative prior to optimize a 3D model across a variety of trajectories. Concretely, we craft an inverse problem that enables a video diffusion model to serve as a prior for multi-trajectory optimization of a parametric 3D model (e.g., Gaussian splatting). To use the generative prior seamlessly, we conduct this process iteratively during optimization. The resulting model can produce high-fidelity virtual driving environments beyond the recorded trajectory, enabling free-form-trajectory driving simulation. Beyond real driving scenes, DriveX can also simulate virtual driving worlds from AI-generated videos.
Problem

Research questions and friction points this paper is trying to address.

Synthesizing driving views on novel free-form trajectories
Overcoming limited viewpoints in existing driving video methods
Integrating generative prior with 3D Gaussian models effectively
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distills generative prior into 3D Gaussian model
Uses video diffusion model for rendering refinement
Progressively updates pseudo ground truth for consistency
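The innovations above describe an iterative loop: render the current 3D Gaussian model along sampled free-form trajectories, refine the imperfect renders with the video diffusion prior, and use the refined frames as pseudo ground truth for the next round of optimization. A minimal toy sketch of that control flow is below; all function names and the scalar "quality" stand-in for a Gaussian-splatting model are illustrative assumptions, not the paper's actual implementation.

```python
import random

# Toy sketch of a DriveX-style multi-trajectory optimization loop.
# A single scalar "quality" stands in for a full Gaussian-splatting
# model; all names here are hypothetical placeholders.

def render(scene, trajectory):
    """Placeholder for differentiable Gaussian-splatting rendering along a path."""
    return [scene["quality"] for _ in trajectory]

def diffusion_refine(frames):
    """Placeholder for the video diffusion prior improving degraded renders."""
    return [min(1.0, f + 0.2) for f in frames]  # nudge frames toward clean targets

def optimize_step(scene, pseudo_gt):
    """Placeholder for fitting the 3D model to the refined pseudo ground truth."""
    target = sum(pseudo_gt) / len(pseudo_gt)
    scene["quality"] += 0.5 * (target - scene["quality"])
    return scene

def drivex_loop(scene, trajectories, iterations=10):
    for _ in range(iterations):
        traj = random.choice(trajectories)       # sample a free-form trajectory
        frames = render(scene, traj)             # imperfect off-trajectory renders
        pseudo_gt = diffusion_refine(frames)     # prior fills in missing detail
        scene = optimize_step(scene, pseudo_gt)  # pseudo-GT is updated each round
    return scene

scene = {"quality": 0.2}  # toy stand-in for an initial 3D reconstruction
trajectories = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
final = drivex_loop(scene, trajectories)
print(round(final["quality"], 3))
```

The key design point mirrored here is that the pseudo ground truth is regenerated inside the loop rather than fixed up front, so the diffusion prior and the 3D model improve each other progressively, matching the "progressively updates pseudo ground truth for consistency" bullet.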