LumiVideo: An Intelligent Agentic System for Video Color Grading

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing automatic color grading methods, which act as black-box executors: they lack interpretability and the iterative control professionals require, and they struggle to balance cinematic aesthetics with temporal consistency. To bridge this gap, the paper introduces an agentic paradigm into video color grading, building a system that emulates the cognitive workflow of expert colorists through four stages: perception, reasoning, execution, and reflection. By combining a large language model (LLM), retrieval-augmented generation (RAG), and Tree of Thoughts (ToT) search, the system automatically generates a cinematic base look within a non-linear color parameter space and supports interactive refinement via natural language feedback. Because it outputs ASC-CDL parameters and a globally consistent 3D LUT rather than edited pixels, temporal consistency is guaranteed analytically. Experiments show that it approaches expert quality in fully automatic mode. The authors also release LumiGrade, the first log-encoded video benchmark for evaluating automated color grading.
📝 Abstract
Video color grading is a critical post-production process that transforms flat, log-encoded raw footage into emotionally resonant cinematic visuals. Existing automated methods act as static, black-box executors that directly output edited pixels, lacking both interpretability and the iterative control required by professionals. We introduce LumiVideo, an agentic system that mimics the cognitive workflow of professional colorists through four stages: Perception, Reasoning, Execution, and Reflection. Given only raw log video, LumiVideo autonomously produces a cinematic base grade by analyzing the scene's physical lighting and semantic content. Its Reasoning engine synergizes an LLM's internalized cinematic knowledge with a Retrieval-Augmented Generation (RAG) framework via a Tree of Thoughts (ToT) search to navigate the non-linear color parameter space. Rather than generating pixels, the system compiles the deduced parameters into industry-standard ASC-CDL configurations and a globally consistent 3D LUT, analytically guaranteeing temporal consistency. An optional Reflection loop then allows creators to refine the result via natural language feedback. We further introduce LumiGrade, the first log-encoded video benchmark for evaluating automated grading. Experiments show that LumiVideo approaches human expert quality in fully automatic mode while enabling precise iterative control when directed.
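The claim that parameter-based output guarantees temporal consistency can be illustrated with the standard ASC-CDL transfer function (slope, offset, power, then saturation with Rec. 709 luma weights). The following is a minimal sketch, not the paper's implementation: the names `CDL`, `apply_cdl`, and `grade_clip` are hypothetical, and frames are assumed to be lists of RGB tuples already normalized to [0, 1].

```python
from dataclasses import dataclass

# Rec. 709 luma weights used by the ASC-CDL saturation step.
REC709_LUMA = (0.2126, 0.7152, 0.0722)


def _clamp01(x):
    """Clamp a value to the [0, 1] range."""
    return min(1.0, max(0.0, x))


@dataclass(frozen=True)
class CDL:
    """Hypothetical container for one set of ASC-CDL parameters."""
    slope: tuple = (1.0, 1.0, 1.0)
    offset: tuple = (0.0, 0.0, 0.0)
    power: tuple = (1.0, 1.0, 1.0)
    saturation: float = 1.0


def apply_cdl(rgb, cdl):
    """Apply slope/offset/power per channel, then saturation."""
    sop = tuple(
        _clamp01(v * s + o) ** p
        for v, s, o, p in zip(rgb, cdl.slope, cdl.offset, cdl.power)
    )
    luma = sum(w * c for w, c in zip(REC709_LUMA, sop))
    return tuple(_clamp01(luma + cdl.saturation * (c - luma)) for c in sop)


def grade_clip(frames, cdl):
    """Apply the SAME parameters to every pixel of every frame.

    The mapping depends only on the pixel value and the fixed parameters,
    so identical input colors map to identical output colors in all frames.
    """
    return [[apply_cdl(px, cdl) for px in frame] for frame in frames]
```

Since the transform is a fixed function of the input color, a pixel value that appears in two different frames is graded identically in both, which is the sense in which a single CDL or 3D LUT per clip cannot introduce frame-to-frame flicker.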
Problem

Research questions and friction points this paper is trying to address.

video color grading
interpretability
iterative control
temporal consistency
cinematic visuals
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic system
video color grading
Retrieval-Augmented Generation
Tree of Thoughts
ASC-CDL