CLIP-Guided Unsupervised Semantic-Aware Exposure Correction

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a semantic-aware unsupervised exposure correction network to address key challenges in existing methods, including detail loss, color distortion, and neglect of semantic content. The approach uniquely integrates FastSAM and CLIP: FastSAM extracts region-level semantics, which are fused with image features via a multi-scale residual Spatial Mamba module to recover fine details. Furthermore, a CLIP-guided pseudo-ground-truth generator and a semantic-prompt consistency loss are introduced to enable high-quality correction without real labeled data. Extensive experiments demonstrate that the proposed method significantly outperforms current unsupervised techniques on real-world under/over-exposed images, achieving state-of-the-art results in both quantitative metrics (PSNR, SSIM) and visual quality, while effectively enhancing color fidelity and semantic consistency.
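The CLIP-guided pseudo-ground-truth generator described above hinges on zero-shot exposure identification: compare an image embedding against text embeddings of exposure prompts and pick the best match. A minimal sketch of that selection step, using random vectors as stand-ins for CLIP embeddings (the prompt wordings and dimensions here are illustrative assumptions, not the paper's):

```python
import numpy as np

def l2_normalize(v):
    # Normalize along the feature axis so the dot product is cosine similarity.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def classify_exposure(image_emb, prompt_embs, prompts):
    """Zero-shot exposure identification: return the prompt whose
    (mock) CLIP text embedding is most similar to the image embedding."""
    sims = l2_normalize(prompt_embs) @ l2_normalize(image_emb)
    return prompts[int(np.argmax(sims))], sims

# Toy stand-ins for CLIP embeddings; the real method uses a fine-tuned
# CLIP image/text encoder pair.
rng = np.random.default_rng(0)
prompts = ["an underexposed photo", "a well-exposed photo", "an overexposed photo"]
prompt_embs = rng.normal(size=(3, 8))
image_emb = prompt_embs[0] + 0.1 * rng.normal(size=8)  # close to "underexposed"

label, sims = classify_exposure(image_emb, prompt_embs, prompts)
print(label)
```

Once the exposure condition is identified, the generator can instruct the matching correction (e.g. brighten for underexposure), which is what makes training possible without manually edited labels.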

📝 Abstract
Improper exposure often leads to severe loss of detail, color distortion, and reduced contrast. Exposure correction still faces two critical challenges: (1) neglecting object-wise regional semantic information causes color-shift artifacts; (2) real-world exposure images generally have no ground-truth labels, and labeling them entails massive manual editing. To tackle these challenges, we propose a new unsupervised semantic-aware exposure correction network. It contains an adaptive semantic-aware fusion module, which effectively fuses semantic information extracted by a pre-trained Fast Segment Anything Model (FastSAM) into a shared image feature space. The fused features are then used by our multi-scale residual spatial Mamba group to restore details and adjust the exposure. To avoid manual editing, we propose a pseudo-ground-truth generator guided by CLIP, which is fine-tuned to automatically identify exposure conditions and instruct tailored corrections. We also leverage the rich priors of FastSAM and CLIP to develop a semantic-prompt consistency loss that enforces semantic consistency and image-prompt alignment during unsupervised training. Comprehensive experiments demonstrate that our method effectively corrects real-world exposure images and outperforms state-of-the-art unsupervised methods both numerically and visually.
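The image-prompt alignment half of the semantic-prompt consistency loss can be sketched as a softmax cross-entropy over cosine similarities between the corrected image's embedding and the exposure prompts, pulling the output toward "well-exposed". This is a hedged toy version: the embeddings are random stand-ins for CLIP features, the prompt texts and temperature are assumptions, and the paper's additional FastSAM semantic-consistency term is omitted.

```python
import numpy as np

def clip_prompt_loss(image_emb, prompt_embs, target_idx, temperature=0.07):
    """Softmax cross-entropy over image-prompt cosine similarities:
    small when the image embedding aligns with the target prompt."""
    norm = lambda v: v / np.linalg.norm(v, axis=-1, keepdims=True)
    logits = norm(prompt_embs) @ norm(image_emb) / temperature
    log_probs = logits - np.log(np.exp(logits).sum())  # log-softmax
    return -log_probs[target_idx]

# Toy stand-ins for CLIP text embeddings of two exposure prompts.
rng = np.random.default_rng(1)
prompts = ["an underexposed photo", "a well-exposed photo"]
prompt_embs = rng.normal(size=(2, 16))
target = 1  # the corrected output should read as "a well-exposed photo"

aligned = prompt_embs[1] + 0.05 * rng.normal(size=16)     # near the target prompt
misaligned = prompt_embs[0] + 0.05 * rng.normal(size=16)  # near the wrong prompt
loss_aligned = clip_prompt_loss(aligned, prompt_embs, target)
loss_misaligned = clip_prompt_loss(misaligned, prompt_embs, target)
print(loss_aligned < loss_misaligned)
```

In training, this scalar would be backpropagated through the correction network (with CLIP frozen), so the network learns to produce outputs whose embeddings align with the well-exposed prompt.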
Problem

Research questions and friction points this paper is trying to address.

exposure correction
semantic-aware
unsupervised learning
color shift
ground-truth labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic-aware exposure correction
unsupervised learning
CLIP-guided pseudo-ground truth
FastSAM
spatial Mamba