CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

📅 2025-03-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing camera-conditioned video generation models suffer from diminished motion fidelity and limited viewpoint diversity under large-scale camera motions. This paper proposes a camera-controllable video diffusion framework built on a two-stage dynamic enhancement paradigm: it first improves motion fidelity within individual video segments, then extends generation to seamless exploration across broad viewpoint ranges. A lightweight camera parameter injection module, together with a matching training scheme, preserves the pre-trained model’s inherent motion dynamics while enabling precise, flexible camera trajectory control. By combining a training dataset rich in dynamic content and annotated with camera parameters with an iterative, trajectory-guided generation strategy, the method improves spatial exploration range, motion fidelity, temporal consistency, and viewpoint diversity across diverse dynamic scenes. It supports interactive, user-driven synthesis of long-range, spatiotemporally coherent exploration videos.

📝 Abstract
This paper introduces CameraCtrl II, a framework that enables large-scale dynamic scene exploration through a camera-controlled video diffusion model. Previous camera-conditioned video generative models suffer from diminished video dynamics and a limited range of viewpoints when generating videos with large camera movements. We take an approach that progressively expands the generation of dynamic scenes: first enhancing dynamic content within individual video clips, then extending this capability to create seamless explorations across broad viewpoint ranges. Specifically, we construct a training dataset that features a large degree of dynamics and carries camera parameter annotations, and we design a lightweight camera injection module and training scheme that preserve the dynamics of the pretrained model. Building on these improved single-clip techniques, we enable extended scene exploration by allowing users to iteratively specify camera trajectories for generating coherent video sequences. Experiments across diverse scenarios demonstrate that CameraCtrl II enables camera-controlled dynamic scene synthesis with substantially wider spatial exploration than previous approaches.
Problem

Research questions and friction points this paper is trying to address.

Diminished video dynamics when generating under large camera movement
Limited range of viewpoints in camera-conditioned video generation
Camera-controlled synthesis of dynamic scenes across broad viewpoint ranges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive dynamic scene generation enhancement
Lightweight camera injection module design
Iterative camera trajectory specification for video
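
The iterative trajectory specification can be illustrated with a small bookkeeping sketch. The abstract does not say how camera poses are represented or how consecutive clips are conditioned, so everything here is an assumption made for illustration: poses are simplified to planar `(x, y, yaw)` tuples, each user-specified segment is given relative to its own first frame, and the hypothetical helpers `compose` and `chain_segments` only place each new segment where the previous clip ended. The video diffusion model itself is not shown.

```python
import math
from typing import List, Tuple

# Simplified planar camera pose (x, y, yaw in radians).
# This is an illustrative assumption, not the paper's camera representation.
Pose = Tuple[float, float, float]


def compose(a: Pose, b: Pose) -> Pose:
    """Express pose b, given in a's local frame, in the world frame."""
    ax, ay, ayaw = a
    bx, by, byaw = b
    c, s = math.cos(ayaw), math.sin(ayaw)
    return (ax + c * bx - s * by, ay + s * bx + c * by, ayaw + byaw)


def chain_segments(segments: List[List[Pose]]) -> List[List[Pose]]:
    """Turn per-clip relative trajectories into world-frame trajectories.

    Each segment is anchored at the last world pose of the previous one,
    so the chained trajectory is continuous across clip boundaries.
    """
    anchor: Pose = (0.0, 0.0, 0.0)
    world_segments: List[List[Pose]] = []
    for segment in segments:
        if not segment:
            raise ValueError("each trajectory segment needs at least one pose")
        world = [compose(anchor, pose) for pose in segment]
        world_segments.append(world)
        anchor = world[-1]  # the next clip starts where this one ended
    return world_segments


# First segment: move forward two units, then face left (yaw of +90 degrees).
seg_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, math.pi / 2)]
# Second segment: move forward one unit in the new viewing direction.
seg_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

world = chain_segments([seg_a, seg_b])
```

In this sketch the second segment begins at the final pose of the first, so a user could keep appending segments to explore further from wherever the last clip stopped.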