DeepSPG: Exploring Deep Semantic Prior Guidance for Low-light Image Enhancement with Multimodal Learning

📅 2025-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Low-light image enhancement (LLIE) often fails under extremely dark conditions, where severe loss of semantic information degrades restoration fidelity. To address this, the paper proposes a Retinex-decomposition-based framework guided by deep semantic priors. It introduces, for the first time in LLIE, dual-path multimodal semantic priors: image-level segmentation priors from SegFormer and text-level semantic alignment via CLIP. These priors feed a multi-scale semantic-aware architecture that jointly models Retinex decomposition (illumination and reflectance) and semantic guidance. The method achieves state-of-the-art performance across five mainstream benchmarks, notably improving structural fidelity and semantic consistency in severely underexposed regions. Code is publicly available.

📝 Abstract
There has long been a belief that high-level semantics learning can benefit various downstream computer vision tasks. However, in the low-light image enhancement (LLIE) community, existing methods learn a brute-force mapping between low-light and normal-light domains without considering the semantic information of different regions, especially in those extremely dark regions that suffer from severe information loss. To address this issue, we propose a new deep semantic prior-guided framework (DeepSPG) based on Retinex image decomposition for LLIE to explore informative semantic knowledge via a pre-trained semantic segmentation model and multimodal learning. Notably, we incorporate both an image-level semantic prior and a text-level semantic prior, and thus formulate a multimodal learning framework with combinatorial deep semantic prior guidance for LLIE. Specifically, we incorporate semantic knowledge to guide the enhancement process via three designs: an image-level semantic prior guidance that leverages hierarchical semantic features from a pre-trained semantic segmentation model; a text-level semantic prior guidance that integrates natural language semantic constraints via a pre-trained vision-language model; and a multi-scale semantic-aware structure that facilitates effective semantic feature incorporation. Our proposed DeepSPG demonstrates superior performance compared to state-of-the-art methods across five benchmark datasets. The implementation details and code are publicly available at https://github.com/Wenyuzhy/DeepSPG.
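The Retinex model underlying the framework factors an image into reflectance and illumination, so that enhancement can act on illumination alone. The toy sketch below illustrates that idea with a naive max-over-channels illumination estimate and a gamma lift; it is only an illustration of the decomposition principle, not the paper's learned decomposition, and all names are our own.

```python
import numpy as np

def retinex_decompose(img, eps=1e-6):
    """Naive Retinex split: illumination as the per-pixel max over
    channels, reflectance as the residual (img = refl * illum)."""
    illum = img.max(axis=-1, keepdims=True)   # H x W x 1 illumination map
    refl = img / (illum + eps)                # H x W x 3 reflectance
    return refl, illum

def enhance(img, gamma=0.45):
    """Brighten by gamma-lifting only the illumination component,
    leaving reflectance (scene content) untouched."""
    refl, illum = retinex_decompose(img)
    brightened = illum ** gamma               # gamma < 1 raises dark values
    return np.clip(refl * brightened, 0.0, 1.0)

low = np.random.rand(4, 4, 3) * 0.1           # simulated very dark image
out = enhance(low)
```

Because reflectance is preserved, object boundaries and colors survive the brightening; the paper's contribution is to make the learned version of this split semantics-aware.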
Problem

Research questions and friction points this paper is trying to address.

Addresses semantic ignorance in low-light image enhancement
Integrates image and text-level semantic prior guidance
Improves enhancement via multimodal learning and Retinex decomposition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pre-trained semantic segmentation model
Integrates vision-language model constraints
Multi-scale semantic-aware structure design
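The text-level prior described above can be understood as a CLIP-style alignment objective: the enhanced image's embedding is pulled toward a "normal-light" text prompt and pushed away from a "low-light" one. The sketch below illustrates that contrastive idea with random placeholder vectors standing in for real CLIP embeddings; the function names and loss form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def text_prior_loss(img_emb, pos_emb, neg_emb):
    """CLIP-style text guidance (illustrative): lower loss when the image
    embedding aligns with the 'normal-light' prompt (pos_emb) rather than
    the 'low-light' prompt (neg_emb)."""
    return cosine(img_emb, neg_emb) - cosine(img_emb, pos_emb)

rng = np.random.default_rng(0)
d = 8                                  # toy embedding dimension
pos = rng.normal(size=d)               # stands in for encode_text("a well-lit photo")
neg = rng.normal(size=d)               # stands in for encode_text("a dark photo")
loss_aligned = text_prior_loss(pos, pos, neg)     # image matches the positive prompt
loss_misaligned = text_prior_loss(neg, pos, neg)  # image matches the negative prompt
```

Minimizing such a loss during training steers the enhancer toward outputs that a vision-language model judges to be well lit, complementing the pixel-level Retinex objective.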
Jialang Lu
School of Cyber Science and Technology, Hubei University, Wuhan, China
Huayu Zhao
Department of Electrical Automation Design, Beijing Shougang International Engineering Technology, Beijing, China
Huiyu Zhai
UESTC
Xingxing Yang
Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
Shini Han
School of Computer Science and Technology, Harbin University of Science and Technology, Heilongjiang, China