ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding

📅 2025-11-24
🤖 AI Summary
This work addresses three challenges in re-rendering dynamic videos under camera control: (1) spatiotemporal misalignment between the input video and the target retake, (2) misuse of RoPE (Rotary Position Embedding) in camera-conditioned models, and (3) poor generalization to variable-length videos. It proposes Rotary Camera Encoding (RoCE), a camera-conditioned positional encoding that injects camera pose into RoPE as a phase shift, which helps the model handle out-of-distribution camera trajectories and video lengths. By encoding multi-view relationships within and across the input and target videos, RoCE improves dynamic object localization and static background preservation. Experiments across various camera trajectories and video lengths show significant improvements in camera controllability, geometric consistency, and video quality.

📝 Abstract
We present ReDirector, a novel camera-controlled video retake generation method for dynamically captured variable-length videos. In particular, we rectify a common misuse of RoPE in previous works by aligning the spatiotemporal positions of the input video and the target retake. Moreover, we introduce Rotary Camera Encoding (RoCE), a camera-conditioned RoPE phase shift that captures and integrates multi-view relationships within and across the input and target videos. By integrating camera conditions into RoPE, our method generalizes to out-of-distribution camera trajectories and video lengths, yielding improved dynamic object localization and static background preservation. Extensive experiments further demonstrate significant improvements in camera controllability, geometric consistency, and video quality across various trajectories and lengths.
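To make the idea of a "RoPE phase shift" concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names `rope_rotate` and `dot` are hypothetical, and the scalar `phase_shift` is only a stand-in for whatever quantity RoCE derives from the camera pose, which this summary does not specify. The sketch shows the one property that follows from the rotary formulation itself: the attention score depends only on the difference in positions and the difference in phase shifts between query and key.

```python
import math


def rope_rotate(vec, position, phase_shift=0.0, base=10000.0):
    """Rotate each consecutive pair of `vec` by (position * freq + phase_shift).

    `phase_shift` is a hypothetical scalar placeholder for a camera-derived
    phase; with phase_shift=0.0 this reduces to standard RoPE.
    """
    if len(vec) % 2 != 0:
        raise ValueError("vector length must be even")
    half = len(vec) // 2
    out = []
    for i in range(half):
        freq = base ** (-i / half)  # standard RoPE frequency schedule
        angle = position * freq + phase_shift
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention logit."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]

# Same relative position (4) and same relative phase (0.3) in both cases.
score_a = dot(rope_rotate(q, 3, 0.2), rope_rotate(k, 7, 0.5))
score_b = dot(rope_rotate(q, 13, 1.2), rope_rotate(k, 17, 1.5))

# Same relative position, different relative phase (0.7).
score_c = dot(rope_rotate(q, 3, 0.2), rope_rotate(k, 7, 0.9))
```

In this toy setup `score_a` and `score_b` agree up to floating-point error, while `score_c` differs. That is the sense in which a phase shift can carry relative information between two tokens (here, hypothetically, between two camera views) without depending on absolute values.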
Problem

Research questions and friction points this paper is trying to address.

Generating variable-length video retakes with camera control
Correcting RoPE misuse to align spatiotemporal video positions
Generalizing to out-of-distribution camera trajectories and video lengths while improving dynamic object localization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rotary Camera Encoding integrates camera conditions into RoPE
Aligns spatiotemporal positions between input and target videos
Generalizes to out-of-distribution camera trajectories and lengths
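
The alignment point can be illustrated with a small sketch. The summary does not spell out what the "misuse" of RoPE is, so the following is one plausible reading, stated as an assumption: if target tokens are given position indices that continue after the input video, corresponding frames end up far apart in RoPE space, whereas giving both videos the same (frame, row, column) indices keeps them aligned. All function names here are hypothetical.

```python
def grid_positions(num_frames, height, width, start_frame=0):
    """Return (frame, row, col) indices for every token of a video latent grid."""
    return [
        (start_frame + t, y, x)
        for t in range(num_frames)
        for y in range(height)
        for x in range(width)
    ]


def concatenated_positions(num_frames, height, width):
    """Assumed 'misaligned' scheme: target frame indices continue after the input."""
    input_pos = grid_positions(num_frames, height, width)
    target_pos = grid_positions(num_frames, height, width, start_frame=num_frames)
    return input_pos, target_pos


def aligned_positions(num_frames, height, width):
    """Assumed 'aligned' scheme: input and target share identical indices."""
    input_pos = grid_positions(num_frames, height, width)
    return input_pos, list(input_pos)


cat_in, cat_tgt = concatenated_positions(2, 1, 2)
ali_in, ali_tgt = aligned_positions(2, 1, 2)
```

With 2 frames of a 1x2 grid, the first target token sits at `(2, 0, 0)` under the concatenated scheme but at `(0, 0, 0)` under the aligned one, matching the first input token. Under the aligned scheme the two videos would have to be told apart by something other than position, for example by a camera-dependent term.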