🤖 AI Summary
Redirecting the camera trajectory of a monocular video lacks precision and controllability in existing approaches. Method: We propose the first dual-stream conditional video diffusion model that decouples deterministic viewpoint transformation from stochastic content generation, enabling arbitrary user-specified 4D camera paths. To train it, we design a double-reprojection strategy that constructs hybrid training data, integrating web-scale monocular videos with static multi-view datasets, and we introduce point-cloud rendering guidance jointly conditioned on the source video to ensure spatiotemporal consistency in novel-view synthesis. Contribution/Results: Our method generalizes to diverse scenes without requiring multi-view input. Experiments demonstrate significant improvements over state-of-the-art approaches on both multi-view benchmarks and large-scale monocular video datasets, achieving high-fidelity 4D content generation with precise, user-controllable camera trajectories.
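To make the dual-stream conditioning concrete, below is a minimal PyTorch sketch of how a denoiser might ingest the two condition streams: point-cloud-render latents fix the target viewpoint, while source-video latents supply appearance through cross-attention. All names, dimensions, and wiring (`DualStreamDenoiser`, `render_proj`, `source_attn`) are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class DualStreamDenoiser(nn.Module):
    """Minimal sketch of dual-stream conditioning (illustrative only).

    Stream 1 (deterministic view transform): latents of point-cloud
    renders along the target trajectory, injected additively so they
    pin down *where* content appears in the novel view.
    Stream 2 (stochastic content): latents of the source video,
    attended to via cross-attention so they supply *what* the content
    looks like (appearance and dynamics).
    """

    def __init__(self, dim: int = 320, heads: int = 8):
        super().__init__()
        self.render_proj = nn.Conv3d(dim, dim, kernel_size=1)      # viewpoint stream
        self.backbone = nn.Conv3d(dim, dim, kernel_size=3, padding=1)
        self.source_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, noisy, render_latents, source_latents, t_emb):
        # Fuse the point-cloud-render stream into the noisy latents.
        h = noisy + self.render_proj(render_latents)
        h = self.backbone(h) + t_emb.view(1, -1, 1, 1, 1)

        # Cross-attend from the fused latents to the source-video stream.
        b, c, f, hh, ww = h.shape
        q = h.permute(0, 2, 3, 4, 1).reshape(b, -1, c)
        kv = source_latents.permute(0, 2, 3, 4, 1).reshape(b, -1, c)
        out, _ = self.source_attn(q, kv, kv)
        return h + out.reshape(b, f, hh, ww, c).permute(0, 4, 1, 2, 3)

# Example shapes: batch 1, 8 frames, 16x16 latent grid.
model = DualStreamDenoiser()
x = torch.randn(1, 320, 8, 16, 16)
pred = model(x, torch.randn_like(x), torch.randn_like(x), torch.randn(320))
```

Keeping the two streams separate mirrors the decoupling the summary describes: the additive render stream carries the deterministic geometry, while cross-attention lets the model borrow appearance from the source video without copying its viewpoint.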
📝 Abstract
We present TrajectoryCrafter, a novel approach for redirecting camera trajectories of monocular videos. By disentangling deterministic view transformations from stochastic content generation, our method achieves precise control over user-specified camera trajectories. We propose a dual-stream conditional video diffusion model that concurrently integrates point cloud renders and source videos as conditions, ensuring accurate view transformations and coherent 4D content generation. Instead of relying on scarce multi-view videos, we curate a hybrid training dataset that combines web-scale monocular videos with static multi-view datasets through our innovative double-reprojection strategy, significantly fostering robust generalization across diverse scenes. Extensive evaluations on multi-view benchmarks and large-scale monocular videos demonstrate the superior performance of our method.
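As a rough illustration of the double-reprojection idea, the sketch below builds a training pair from a single monocular frame: the frame's point cloud is first splatted into a perturbed pose, and only the points visible from that pose are splatted back to the source view, yielding a hole-ridden render that is pixel-aligned with the original frame. The helper names and the naive z-buffer splatting are assumptions for illustration, not the paper's exact pipeline; per-frame depth and intrinsics `K` are assumed available.

```python
import torch

def unproject(depth, K):
    """Lift a depth map (H, W) into camera-space 3D points (H*W, 3)."""
    H, W = depth.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).float()
    rays = pix.reshape(-1, 3) @ torch.linalg.inv(K).T
    return rays * depth.reshape(-1, 1)

def splat(points, colors, K, pose, H, W):
    """Forward-splat colored points through a 4x4 pose with a naive
    z-buffer; returns the rendered image and surviving point indices."""
    cam = points @ pose[:3, :3].T + pose[:3, 3]
    z = cam[:, 2]
    proj = cam @ K.T
    u = (proj[:, 0] / z.clamp(min=1e-6)).round().long()
    v = (proj[:, 1] / z.clamp(min=1e-6)).round().long()
    valid = (z > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    image = torch.zeros(H, W, 3)
    zbuf = torch.full((H, W), float("inf"))
    winner = torch.full((H, W), -1, dtype=torch.long)
    for i in torch.nonzero(valid).flatten().tolist():  # slow but clear
        if z[i] < zbuf[v[i], u[i]]:
            zbuf[v[i], u[i]] = z[i]
            winner[v[i], u[i]] = i
            image[v[i], u[i]] = colors[i]
    return image, winner[winner >= 0]

def double_reprojection_pair(frame, depth, K, delta_pose):
    """Build a (condition, ground-truth) pair from one monocular frame.

    Step 1: splat the frame's point cloud into a perturbed pose, so
    points occluded from that viewpoint drop out.
    Step 2: splat the survivors back to the source pose, producing a
    hole-ridden render pixel-aligned with the original frame, which
    then serves as ground truth.
    """
    H, W, _ = frame.shape
    pts, colors = unproject(depth, K), frame.reshape(-1, 3)
    _, kept = splat(pts, colors, K, delta_pose, H, W)                # step 1
    cond, _ = splat(pts[kept], colors[kept], K, torch.eye(4), H, W)  # step 2
    return cond, frame
```

Because the condition and target are pixel-aligned by construction, ordinary monocular frames can supervise novel-view synthesis without any multi-view capture; the magnitude of the pose perturbation controls how severe the occlusion holes in the condition become.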