🤖 AI Summary
Existing methods for novel-trajectory driving video generation have two key limitations: repair-based methods fail to restore complex artifacts, while LiDAR-based approaches rely on sparse and incomplete point clouds and therefore offer only coarse camera control and weak geometric guidance. This paper introduces ReCamDriving, a purely vision-based, camera-controllable novel-trajectory video generation framework that uses dense, scene-complete 3D Gaussian Splatting (3DGS) renderings as explicit geometric guidance. A two-stage training paradigm mitigates overfitting to restoration behaviors: the first stage conditions on camera poses for coarse control, and the second adds 3DGS renderings for fine-grained viewpoint and geometric guidance. A 3DGS-based cross-trajectory data curation strategy enables scalable multi-trajectory supervision from monocular videos and is used to build the ParaDrive dataset of over 110K parallel-trajectory video pairs. Experiments show state-of-the-art camera controllability and structural consistency.
📝 Abstract
We propose ReCamDriving, a purely vision-based, camera-controlled novel-trajectory video generation framework. While repair-based methods fail to restore complex artifacts and LiDAR-based approaches rely on sparse and incomplete cues, ReCamDriving leverages dense and scene-complete 3DGS renderings for explicit geometric guidance, achieving precise camera-controllable generation. To mitigate overfitting to restoration behaviors when conditioned on 3DGS renderings, ReCamDriving adopts a two-stage training paradigm: the first stage uses camera poses for coarse control, while the second stage incorporates 3DGS renderings for fine-grained viewpoint and geometric guidance. Furthermore, we present a 3DGS-based cross-trajectory data curation strategy to eliminate the train-test gap in camera transformation patterns, enabling scalable multi-trajectory supervision from monocular videos. Based on this strategy, we construct the ParaDrive dataset, containing over 110K parallel-trajectory video pairs. Extensive experiments demonstrate that ReCamDriving achieves state-of-the-art camera controllability and structural consistency.
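As a rough illustration of what a "parallel trajectory" could look like in terms of camera poses, the sketch below shifts every pose of a recorded trajectory sideways by a fixed distance. The abstract does not say how ParaDrive parameterizes its trajectories, so the lateral offset along the camera's own x-axis, the 4x4 camera-to-world convention, and the names `shift_laterally` and `parallel_trajectory` are all assumptions made for this example, not the authors' method.

```python
from typing import List

# Assumed convention: a pose is a 4x4 camera-to-world matrix in row-major nested lists.
Pose = List[List[float]]


def shift_laterally(pose: Pose, offset_m: float) -> Pose:
    """Return a copy of `pose` translated `offset_m` metres along the camera's x-axis."""
    # The first column of the rotation block is the camera x-axis in world coordinates.
    right = [pose[r][0] for r in range(3)]
    shifted = [row[:] for row in pose]  # copy so the recorded pose is left unchanged
    for r in range(3):
        shifted[r][3] = pose[r][3] + offset_m * right[r]
    return shifted


def parallel_trajectory(poses: List[Pose], offset_m: float) -> List[Pose]:
    """Apply the same lateral shift to every pose (hypothetical helper)."""
    return [shift_laterally(p, offset_m) for p in poses]


# Toy recorded trajectory: identity orientation, moving forward along world z.
recorded = [
    [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, float(z)],
     [0.0, 0.0, 0.0, 1.0]]
    for z in range(3)
]
shifted = parallel_trajectory(recorded, offset_m=2.0)
```

In this toy setup the orientation and forward motion of each pose are preserved and only the translation moves sideways. Under the assumptions above, a pair of videos on the recorded and shifted trajectories would share the same scene and differ only by this camera transformation.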