CamCloneMaster: Enabling Reference-based Camera Control for Video Generation

📅 2025-06-03

📈 Citations: 0

✨ Influential: 0

career value

233K/year

🤖 AI Summary

This work addresses the lack of intuitive camera motion control in video generation—where existing methods rely on manually specified camera parameters or test-time optimization. We propose a reference-based camera cloning framework that requires neither explicit camera parameters nor test-time fine-tuning. Methodologically, we introduce the first large-scale synthetic Camera Clone Dataset and design a diffusion-based cross-video motion transfer architecture that disentangles camera motion from scene content in latent space, trained end-to-end via synthetic data supervision. Experiments demonstrate state-of-the-art performance in both camera controllability and visual fidelity. A user study confirms significant improvements in usability and motion expressiveness. Our core contribution is the first parameter-free, purely reference-video-driven camera motion cloning paradigm, establishing a new standard for intuitive and controllable video generation.

Technology Category

Application Category

📝 Abstract

Camera control is crucial for generating expressive and cinematic videos. Existing methods rely on explicit sequences of camera parameters as control conditions, which can be cumbersome for users to construct, particularly for intricate camera movements. To provide a more intuitive camera control method, we propose CamCloneMaster, a framework that enables users to replicate camera movements from reference videos without requiring camera parameters or test-time fine-tuning. CamCloneMaster seamlessly supports reference-based camera control for both Image-to-Video and Video-to-Video tasks within a unified framework. Furthermore, we present the Camera Clone Dataset, a large-scale synthetic dataset designed for camera clone learning, encompassing diverse scenes, subjects, and camera movements. Extensive experiments and user studies demonstrate that CamCloneMaster outperforms existing methods in terms of both camera controllability and visual quality.

Problem

Research questions and friction points this paper is trying to address.

Enables intuitive camera control without parameter sequences

Replicates camera movements from reference videos directly

Supports unified video generation for diverse tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Replicates camera movements from reference videos

Supports Image-to-Video and Video-to-Video tasks

Uses large-scale synthetic dataset for learning

🔎 Similar Papers

CameraCtrl: Enabling Camera Control for Text-to-Video Generation

2024-04-02arXiv.orgCitations: 64