Low-Light Video Enhancement with An Effective Spatial-Temporal Decomposition Paradigm

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes VLLVE, a view-aware low-light video enhancement framework that addresses the severe visual degradation caused by extreme underexposure and noise in low-light videos. VLLVE decouples each frame into a view-independent component (intrinsic appearance) and a view-dependent component (illumination), realizing this spatial-temporal decomposition through a dual-structure encoder-decoder network with a cross-frame interaction mechanism. An extended variant, VLLVE++, adds an additive residual term that models scene-adaptive degradations and jointly optimizes enhancement and degradation-aware correspondence refinement end to end in a bidirectional manner, guided by cross-frame correspondence modeling and a scene-level continuity constraint. Extensive experiments on standard LLVE benchmarks show superior robustness and visual quality in real-world, highly dynamic scenes.

📝 Abstract
Low-Light Video Enhancement (LLVE) seeks to restore dynamic or static scenes plagued by severe invisibility and noise. In this paper, we present an innovative video decomposition strategy that incorporates view-independent and view-dependent components to enhance the performance of LLVE. The framework is called View-aware Low-light Video Enhancement (VLLVE). We leverage dynamic cross-frame correspondences for the view-independent term (which primarily captures intrinsic appearance) and impose a scene-level continuity constraint on the view-dependent term (which mainly describes the shading condition) to achieve consistent and satisfactory decomposition results. To further ensure consistent decomposition, we introduce a dual-structure enhancement network featuring a cross-frame interaction mechanism. By supervising different frames simultaneously, this network encourages them to exhibit matching decomposition features. This mechanism can seamlessly integrate with encoder-decoder single-frame networks, incurring minimal additional parameter costs. Building upon VLLVE, we propose a more comprehensive decomposition strategy by introducing an additive residual term, resulting in VLLVE++. This residual term can simulate scene-adaptive degradations, which are difficult to model using a decomposition formulation for common scenes, thereby further enhancing the ability to capture the overall content of videos. In addition, VLLVE++ enables bidirectional learning for both enhancement and degradation-aware correspondence refinement (end-to-end manner), effectively increasing reliable correspondences while filtering out incorrect ones. Notably, VLLVE++ demonstrates strong capability in handling challenging cases, such as real-world scenes and videos with high dynamics. Extensive experiments are conducted on widely recognized LLVE benchmarks.
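The decomposition the abstract describes can be sketched in a toy form. The functions below are hypothetical stand-ins (the paper's actual decomposition is learned by a network): each frame `I` is split into a view-independent reflectance `R`, a view-dependent shading `S`, and an additive residual `E` such that `I = R * S + E`, and a consistency penalty keeps the view-independent term stable across frames. Here shading is just the local mean luminance, purely to make the formulation concrete.

```python
import numpy as np

def decompose(frame):
    """Toy stand-in for the learned decomposition (hypothetical).
    Splits frame I into a view-independent reflectance R, a
    view-dependent shading S, and an additive residual E so that
    I = R * S + E, mirroring the VLLVE++ formulation described above."""
    # Toy choice: shading = per-pixel mean luminance, reflectance = I / S.
    shading = np.clip(frame.mean(axis=-1, keepdims=True), 1e-3, None)
    reflectance = frame / shading
    residual = frame - reflectance * shading  # ~0 under this exact split
    return reflectance, shading, residual

def cross_frame_consistency(refl_a, refl_b):
    """Penalty encouraging matching view-independent terms across frames.
    A real system would first warp via estimated correspondences; this
    sketch assumes the two frames are already aligned."""
    return float(np.mean((refl_a - refl_b) ** 2))

# Two underexposed frames of the same scene, differing only in exposure.
rng = np.random.default_rng(0)
scene = rng.uniform(0.0, 1.0, size=(4, 4, 3))
frame_t = 0.1 * scene           # dark frame at time t
frame_t1 = 0.1 * scene * 1.05   # slightly brighter frame at time t+1

r0, s0, e0 = decompose(frame_t)
r1, s1, e1 = decompose(frame_t1)

# Reconstruction holds by construction: I = R * S + E.
assert np.allclose(r0 * s0 + e0, frame_t)
# The view-independent term is stable under a global exposure change,
# so the cross-frame consistency penalty is near zero.
print(cross_frame_consistency(r0, r1))
```

Because a global exposure change scales frame and shading by the same factor, the reflectance cancels it out, which is exactly why the view-independent term is the natural target for cross-frame supervision.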
Problem

Research questions and friction points this paper is trying to address.

Low-Light Video Enhancement
video decomposition
noise
invisibility
scene degradation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatial-Temporal Decomposition
View-aware Enhancement
Cross-frame Correspondence
Additive Residual Modeling
Bidirectional Learning