TC-Light: Temporally Consistent Relighting for Dynamic Long Videos

📅 2025-06-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing video relighting methods are predominantly limited to portrait scenarios and struggle to simultaneously ensure temporal consistency and computational efficiency for long, dynamic videos. To address this challenge, we propose TC-Light, a two-stage post-optimization framework for high-quality, efficient lighting editing under complex motion. Starting from a video preliminarily relit by an inflated video relighting model, the first stage optimizes an appearance embedding to align global illumination; the second stage optimizes the proposed Unique Video Tensor (UVT), a compact canonical video representation, to align fine-grained texture and lighting details. Evaluated on a newly established benchmark of long, highly dynamic videos, our approach significantly improves temporal consistency, reduces inference overhead, and enables physically plausible, photorealistic relighting of complex dynamic scenes.

📝 Abstract
Editing illumination in long videos with complex dynamics has significant value in various downstream tasks, including visual content creation and manipulation, as well as data scaling up for embodied AI through sim2real and real2real transfer. Nevertheless, existing video relighting techniques are predominantly limited to portrait videos or are bottlenecked by temporal consistency and computational efficiency. In this paper, we propose TC-Light, a novel paradigm characterized by a two-stage post-optimization mechanism. Starting from a video preliminarily relit by an inflated video relighting model, it optimizes an appearance embedding in the first stage to align global illumination. In the second stage, it optimizes the proposed canonical video representation, i.e., the Unique Video Tensor (UVT), to align fine-grained texture and lighting. To comprehensively evaluate performance, we also establish a long and highly dynamic video benchmark. Extensive experiments show that our method enables physically plausible relighting results with superior temporal coherence and low computation cost. The code and video demos are available at https://dekuliutesla.github.io/tclight/.
Problem

Research questions and friction points this paper is trying to address.

Editing illumination in dynamic long videos efficiently
Ensuring temporal consistency in video relighting results
Overcoming limitations of existing portrait-focused relighting techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage post-optimization mechanism
Unique Video Tensor (UVT) representation
Inflated video relighting model
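The two-stage idea can be illustrated with a toy NumPy sketch. This is not the paper's implementation: TC-Light optimizes learned appearance embeddings and an exposure-aligned Unique Video Tensor by gradient descent; here, stage 1 is approximated by a closed-form per-frame affine (gain/offset) fit that aligns each frame's global illumination to a preliminarily relit reference, and stage 2 by a per-pixel temporal median standing in for the canonical tensor. All function names are illustrative.

```python
import numpy as np

def stage1_align_global_illumination(frames, reference):
    """Stage 1 (sketch): treat the appearance embedding as a scalar
    gain/offset per frame, fit by least squares so each frame's global
    illumination statistics match the relit reference sequence."""
    aligned = []
    for f, r in zip(frames, reference):
        # Solve min ||a*f + b - r||^2 for scalars a, b (closed form).
        A = np.stack([f.ravel(), np.ones(f.size)], axis=1)
        (a, b), *_ = np.linalg.lstsq(A, r.ravel(), rcond=None)
        aligned.append(a * f + b)
    return np.stack(aligned)

def stage2_unique_video_tensor(aligned):
    """Stage 2 (sketch): collapse the aligned frames into a single
    canonical tensor (a per-pixel temporal median) and re-render every
    frame from it, enforcing temporal consistency of fine detail."""
    uvt = np.median(aligned, axis=0)  # stand-in canonical representation
    return np.broadcast_to(uvt, aligned.shape).copy()
```

In this simplified static-scene setting, stage 1 undoes per-frame exposure drift and stage 2 makes the output perfectly consistent over time; the actual UVT additionally handles scene motion, which the median stand-in does not.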
Yang Liu
NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences
Chuanchen Luo
Shandong University
3D Vision, Generative AI, Spatial Intelligence, Human-Centric Perception
Zimo Tang
Huazhong University of Science and Technology
Yingyan Li
Institute of Automation, Chinese Academy of Sciences
Computer Vision
Yuran Yang
Tencent
Yuanyong Ning
Tencent
Lue Fan
NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Junran Peng
Associate Professor, USTB
3D AIGC, 3D Comprehension and Reconstruction, Embodied AI
Zhaoxiang Zhang
Institute of Automation, Chinese Academy of Sciences
Computer Vision, Pattern Recognition, Biologically-inspired Learning