GMODiff: One-Step Gain Map Refinement with Diffusion Priors for HDR Reconstruction

📅 2025-12-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address three key challenges in multi-exposure HDR reconstruction with pretrained latent diffusion models (LDMs), namely limited dynamic range, high inference overhead, and content hallucination, this paper proposes a gain-map (GM)-driven one-step diffusion framework. Instead of generating full HDR images, the method estimates the GM, leveraging the LDM's latent-space priors for efficient single-step reconstruction. Regression priors guide both the denoising step and the latent decoding, preserving structural fidelity and perceptual quality while suppressing hallucinations. Experiments show that the approach performs favorably against several state-of-the-art methods and runs about 100× faster than previous LDM-based methods.

📝 Abstract
Pre-trained Latent Diffusion Models (LDMs) have recently shown strong perceptual priors for low-level vision tasks, making them a promising direction for multi-exposure High Dynamic Range (HDR) reconstruction. However, directly applying LDMs to HDR remains challenging due to: (1) limited dynamic-range representation caused by 8-bit latent compression, (2) high inference cost from multi-step denoising, and (3) content hallucination inherent to their generative nature. To address these challenges, we introduce GMODiff, a gain map-driven one-step diffusion framework for multi-exposure HDR reconstruction. Instead of reconstructing full HDR content, we reformulate HDR reconstruction as a conditionally guided Gain Map (GM) estimation task, where the GM encodes the extended dynamic range while retaining the same bit depth as LDR images. We initialize the denoising process from an informative regression-based estimate rather than pure noise, enabling the model to generate high-quality GMs in a single denoising step. Furthermore, recognizing that regression-based models excel in content fidelity while LDMs favor perceptual quality, we leverage regression priors to guide both the denoising process and latent decoding of the LDM, suppressing hallucinations while preserving structural accuracy. Extensive experiments demonstrate that our GMODiff performs favorably against several state-of-the-art methods and is 100× faster than previous LDM-based methods.
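The gain-map idea in the abstract can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out; it assumes a common log2-domain gain map, where an 8-bit code per pixel stores how many stops brighter the HDR value is than the LDR value. The function names, the `max_stops` range, and the `eps` offset are all hypothetical choices made for illustration.

```python
import math

def encode_gain_map(hdr, ldr, max_stops=4.0, eps=1e-6):
    """Quantize the per-pixel log2 gain (HDR / LDR) to 8-bit codes in [0, 255].

    Assumption: gains are clamped to [0, max_stops] stops; this range is
    illustrative and not taken from the paper.
    """
    codes = []
    for h, l in zip(hdr, ldr):
        log_gain = math.log2((h + eps) / (l + eps))
        norm = min(max(log_gain / max_stops, 0.0), 1.0)
        codes.append(round(norm * 255))
    return codes

def apply_gain_map(ldr, codes, max_stops=4.0, eps=1e-6):
    """Recover approximate HDR values from LDR pixels and 8-bit gain codes."""
    out = []
    for l, c in zip(ldr, codes):
        gain = 2.0 ** (c / 255 * max_stops)
        out.append((l + eps) * gain - eps)
    return out

# Linear LDR values in [0, 1] and HDR values that exceed that range.
ldr = [0.25, 0.5, 1.0]
hdr = [0.25, 2.0, 16.0]
codes = encode_gain_map(hdr, ldr)
print(codes, apply_gain_map(ldr, codes))
```

Under these assumptions, an HDR value 16 times brighter than its LDR pixel still fits in a single 8-bit code, which is the property the abstract relies on: the map carries the extended range at LDR bit depth, at the cost of a small quantization error.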
Problem

Research questions and friction points this paper is trying to address.

Overcoming limitations of Latent Diffusion Models for HDR reconstruction.
Reducing high inference cost from multi-step denoising in HDR tasks.
Mitigating content hallucination while preserving structural accuracy in HDR.
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step diffusion framework using gain maps
Regression-based initialization for single denoising step
Guided denoising with regression priors to reduce hallucinations
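As a rough illustration of why a single denoising step can suffice, the sketch below uses the standard DDPM relation between a noisy latent, the noise, and the clean latent. It is a generic textbook identity, not GMODiff's actual network or schedule; `alpha_bar`, the function names, and the toy latent values are assumptions for illustration. In a regression-initialized setting, the regression estimate would play the role of the starting latent and a trained network would supply `eps_hat`.

```python
import math

def add_noise(z0, eps, alpha_bar):
    """Forward diffusion: z_t = sqrt(alpha_bar) * z0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * z + b * e for z, e in zip(z0, eps)]

def one_step_estimate(z_t, eps_hat, alpha_bar):
    """Invert the forward relation in one step, given a predicted noise eps_hat."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(z - b * e) / a for z, e in zip(z_t, eps_hat)]

# Toy check with an oracle noise prediction (a real model would predict eps_hat).
z0 = [0.3, -1.2, 0.7]       # hypothetical clean gain-map latent
eps = [0.5, -0.1, 1.3]      # hypothetical noise sample
z_t = add_noise(z0, eps, alpha_bar=0.5)
print(one_step_estimate(z_t, eps, alpha_bar=0.5))
```

The closer the starting latent already is to the target, the less the noise predictor has to correct, which is the intuition behind starting from a regression estimate rather than pure noise.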
Authors
Tao Hu
Northwestern Polytechnical University
Weiyu Zhou
Northwestern Polytechnical University
Yanjie Tu
Northwestern Polytechnical University
Peng Wu
Northwestern Polytechnical University
Wei Dong
Xi’an University of Architecture and Technology
Qingsen Yan
Northwestern Polytechnical University
Image processing, Image fusion, Continual learning
Yanning Zhang
Northwestern Polytechnical University
Computer Vision