DeepSPG: Exploring Deep Semantic Prior Guidance for Low-light Image Enhancement with Multimodal Learning

📅 2025-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Low-light image enhancement (LLIE) often fails under extremely dark conditions, where severe loss of semantic information degrades restoration fidelity. To address this, the paper proposes a Retinex-decomposition-based framework guided by deep semantic priors. It introduces, for the first time in LLIE, dual-path multimodal semantic priors: image-level segmentation priors from SegFormer and text-level semantic alignment via CLIP. These priors feed a multi-scale semantic-aware architecture that jointly models Retinex decomposition (illumination and reflectance) and semantic guidance. The method achieves state-of-the-art performance across five mainstream benchmarks, notably improving structural fidelity and semantic consistency in severely underexposed regions. Code is publicly available.

📝 Abstract
There has long been a belief that high-level semantics learning can benefit various downstream computer vision tasks. However, in the low-light image enhancement (LLIE) community, existing methods learn a brute-force mapping between low-light and normal-light domains without considering the semantic information of different regions, especially in those extremely dark regions that suffer from severe information loss. To address this issue, we propose a new deep semantic prior-guided framework (DeepSPG) based on Retinex image decomposition for LLIE to explore informative semantic knowledge via a pre-trained semantic segmentation model and multimodal learning. Notably, we incorporate both an image-level semantic prior and a text-level semantic prior, and thus formulate a multimodal learning framework with combinatorial deep semantic prior guidance for LLIE. Specifically, we incorporate semantic knowledge to guide the enhancement process via three designs: an image-level semantic prior guidance that leverages hierarchical semantic features from a pre-trained semantic segmentation model; a text-level semantic prior guidance that integrates natural language semantic constraints via a pre-trained vision-language model; and a multi-scale semantic-aware structure that facilitates effective semantic feature incorporation. Our proposed DeepSPG demonstrates superior performance compared to state-of-the-art methods across five benchmark datasets. The implementation details and code are publicly available at https://github.com/Wenyuzhy/DeepSPG.
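The Retinex model underlying the framework factors an image into reflectance and illumination, so that enhancement can act on illumination alone. The toy sketch below illustrates that idea with a naive max-over-channels illumination estimate and a gamma lift; it is only an illustration of the decomposition principle, not the paper's learned decomposition, and all names are our own.

```python
import numpy as np

def retinex_decompose(img, eps=1e-6):
    """Naive Retinex split: illumination as the per-pixel max over
    channels, reflectance as the residual (img = refl * illum)."""
    illum = img.max(axis=-1, keepdims=True)   # H x W x 1 illumination map
    refl = img / (illum + eps)                # H x W x 3 reflectance
    return refl, illum

def enhance(img, gamma=0.45):
    """Brighten by gamma-lifting only the illumination component,
    leaving reflectance (scene content) untouched."""
    refl, illum = retinex_decompose(img)
    brightened = illum ** gamma               # gamma < 1 raises dark values
    return np.clip(refl * brightened, 0.0, 1.0)

low = np.random.rand(4, 4, 3) * 0.1           # simulated very dark image
out = enhance(low)
```

Because reflectance is preserved, object boundaries and colors survive the brightening; the paper's contribution is to make the learned version of this split semantics-aware.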
Problem

Research questions and friction points this paper is trying to address.

Addresses semantic ignorance in low-light image enhancement
Integrates image and text-level semantic prior guidance
Improves enhancement via multimodal learning and Retinex decomposition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pre-trained semantic segmentation model
Integrates vision-language model constraints
Multi-scale semantic-aware structure design
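The text-level prior described above can be understood as a CLIP-style alignment objective: the enhanced image's embedding is pulled toward a "normal-light" text prompt and pushed away from a "low-light" one. The sketch below illustrates that contrastive idea with random placeholder vectors standing in for real CLIP embeddings; the function names and loss form are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def text_prior_loss(img_emb, pos_emb, neg_emb):
    """CLIP-style text guidance (illustrative): lower loss when the image
    embedding aligns with the 'normal-light' prompt (pos_emb) rather than
    the 'low-light' prompt (neg_emb)."""
    return cosine(img_emb, neg_emb) - cosine(img_emb, pos_emb)

rng = np.random.default_rng(0)
d = 8                                  # toy embedding dimension
pos = rng.normal(size=d)               # stands in for encode_text("a well-lit photo")
neg = rng.normal(size=d)               # stands in for encode_text("a dark photo")
loss_aligned = text_prior_loss(pos, pos, neg)     # image matches the positive prompt
loss_misaligned = text_prior_loss(neg, pos, neg)  # image matches the negative prompt
```

Minimizing such a loss during training steers the enhancer toward outputs that a vision-language model judges to be well lit, complementing the pixel-level Retinex objective.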
Jialang Lu
School of Cyber Science and Technology, Hubei University, Wuhan, China
Huayu Zhao
Department of Electrical Automation Design, Beijing Shougang International Engineering Technology, Beijing, China
Huiyu Zhai
UESTC
Xingxing Yang
Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China
Shini Han
School of Computer Science and Technology, Harbin University of Science and Technology, Heilongjiang, China