ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

📅 2025-12-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods have two key limitations: repair-based paradigms struggle to recover complex visual artifacts, while LiDAR-guided approaches rely on sparse and incomplete point clouds, which yield coarse camera control and weak geometric guidance. This paper introduces the first purely vision-based, camera-controllable novel-trajectory video generation framework. It uses dense, scene-complete 3D Gaussian Splatting (3DGS) renderings as a geometric prior to synthesize high-fidelity driving videos under arbitrary camera trajectories. The paper proposes a two-stage training paradigm and a cross-trajectory 3DGS data construction strategy that, for the first time, distills multi-trajectory supervision at scale directly from monocular videos, and it introduces the ParaDrive multi-trajectory dataset. Experiments show state-of-the-art camera pose control and structural consistency, improving both the geometric fidelity and the viewpoint controllability of the generated videos.

📝 Abstract

We propose ReCamDriving, a purely vision-based, camera-controlled novel-trajectory video generation framework. While repair-based methods fail to restore complex artifacts and LiDAR-based approaches rely on sparse and incomplete cues, ReCamDriving leverages dense and scene-complete 3DGS renderings for explicit geometric guidance, achieving precise camera-controllable generation. To mitigate overfitting to restoration behaviors when conditioned on 3DGS renderings, ReCamDriving adopts a two-stage training paradigm: the first stage uses camera poses for coarse control, while the second stage incorporates 3DGS renderings for fine-grained viewpoint and geometric guidance. Furthermore, we present a 3DGS-based cross-trajectory data curation strategy to eliminate the train-test gap in camera transformation patterns, enabling scalable multi-trajectory supervision from monocular videos. Based on this strategy, we construct the ParaDrive dataset, containing over 110K parallel-trajectory video pairs. Extensive experiments demonstrate that ReCamDriving achieves state-of-the-art camera controllability and structural consistency.
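The two-stage paradigm described in the abstract can be pictured as a switch over which conditioning signals are active at a given training step. The following is a minimal sketch, not the authors' implementation: the names `Conditioning` and `conditioning_for_step`, the step-count boundary `stage_one_steps`, and the choice to keep camera poses active in the second stage are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditioning:
    """Which conditioning signals are fed to the generator (hypothetical)."""
    use_camera_pose: bool
    use_3dgs_render: bool


def conditioning_for_step(step: int, stage_one_steps: int) -> Conditioning:
    """Pick the conditioning signals for a training step.

    Assumption: stage one covers steps [0, stage_one_steps) and uses only
    camera poses (coarse control); every later step belongs to stage two,
    which adds 3DGS renderings (fine-grained guidance) on top of the poses.
    """
    if step < 0 or stage_one_steps < 0:
        raise ValueError("step and stage_one_steps must be non-negative")
    if step < stage_one_steps:
        return Conditioning(use_camera_pose=True, use_3dgs_render=False)
    return Conditioning(use_camera_pose=True, use_3dgs_render=True)
```

With `stage_one_steps=10`, step 9 is conditioned on poses alone, and step 10 onward also receives the 3DGS rendering. Delaying the renderings this way reflects the abstract's stated goal of avoiding overfitting to restoration behavior.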

Problem

Research questions and friction points this paper is trying to address.

Generates novel-trajectory driving videos under camera control from vision-only inputs, without LiDAR

Overcomes limitations of repair-based and LiDAR-dependent methods

Ensures precise camera control and structural consistency in generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses dense 3DGS renderings for geometric guidance

Employs two-stage training with coarse and fine control

Introduces cross-trajectory data curation for scalable supervision
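To make the notion of a parallel trajectory concrete, the sketch below shifts each recorded camera pose sideways by a fixed distance, giving target poses from which a reconstructed scene could be rendered. This is an illustrative assumption rather than the paper's curation pipeline: the source does not state how ParaDrive's parallel trajectories are parameterized, and the function names, the 4x4 camera-to-world convention, and the use of the first rotation column as the camera's right axis are all hypothetical choices.

```python
from typing import List

Pose = List[List[float]]  # 4x4 camera-to-world matrix, row-major (assumed)


def shift_pose_laterally(pose: Pose, offset_m: float) -> Pose:
    """Return a copy of `pose` moved `offset_m` metres along its own x axis.

    Assumption: column 0 of the rotation block is the camera's right axis
    expressed in world coordinates, and column 3 holds the camera position.
    """
    right = [pose[r][0] for r in range(3)]
    shifted = [row[:] for row in pose]  # copy rows so the input is not mutated
    for r in range(3):
        shifted[r][3] = pose[r][3] + offset_m * right[r]
    return shifted


def parallel_trajectory(poses: List[Pose], offset_m: float) -> List[Pose]:
    """Apply the same lateral shift to every pose of a recorded trajectory."""
    return [shift_pose_laterally(p, offset_m) for p in poses]
```

Each shifted pose keeps the original orientation, so the new trajectory runs alongside the recorded one at a constant lateral distance. A positive `offset_m` moves the camera to its right under the stated convention.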

🔎 Similar Papers

DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes