LatentHDR: Decoupling Exposure from Diffusion via Conditional Latent-to-Latent Mapping for Text/Image-to-Panoramic HDR

📅 2026-05-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing generative models struggle to synthesize high dynamic range (HDR) images efficiently. They typically run multiple exposure-conditioned generation passes, which incurs substantial computational overhead and introduces structural inconsistencies across exposures. This work decouples scene generation from exposure modeling in latent space: a pre-trained diffusion backbone generates a single coherent scene representation, and a lightweight conditional mapping head then turns it into a structurally consistent stack of densely sampled exposures. To the best of our knowledge, this is the first method to generate a high-quality exposure stack for HDR in a single pass rather than through repeated diffusion runs, which improves both efficiency and consistency. Experiments on synthetic data and the SI-HDR benchmark show state-of-the-art dynamic range with competitive perceptual quality, while reducing computational cost by an order of magnitude.
📝 Abstract
High Dynamic Range (HDR) generation remains challenging for generative models, which are largely limited to low dynamic range outputs. Recent diffusion-based approaches approximate HDR by generating multiple exposure-conditioned samples, incurring high computational cost and structural inconsistencies across exposures. We propose LatentHDR, a framework that decouples scene generation from exposure modeling in latent space. A pretrained diffusion backbone produces a single coherent scene representation, while a lightweight conditional latent-to-latent head deterministically maps it to exposure-specific representations. This enables the generation of a dense, structurally consistent exposure stack in a single pass. This design eliminates multi-pass diffusion, ensures cross-exposure alignment, and enables scalable HDR synthesis. LatentHDR supports both text- and image-conditioned HDR generation for perspective and panoramic scenes. Experiments on synthetic data and the SI-HDR benchmark show that LatentHDR achieves state-of-the-art dynamic range with competitive perceptual quality, while reducing computation by an order of magnitude. Our results demonstrate that high-quality HDR generation can be achieved through structured latent modeling, challenging the need for stochastic multi-exposure generation.
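To make the decoupling concrete, here is a minimal NumPy sketch of the idea the abstract describes: one scene latent goes in, and a small deterministic head conditioned on an exposure value returns a whole stack of exposure-specific latents in one batched call. The abstract does not specify the head's architecture, so everything here is an assumption for illustration: the per-channel scale-and-shift (FiLM-style) modulation, the layer sizes, the exposure range in stops, the untrained random weights, and the names `make_head` and `exposure_head`. The random `z_scene` merely stands in for the diffusion backbone's output.

```python
import numpy as np

def make_head(channels, hidden=16, seed=0):
    """Random, untrained weights for a toy exposure-conditioned head (hypothetical)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.5, size=(1, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.5, size=(hidden, 2 * channels)),
        "b2": np.zeros(2 * channels),
    }

def exposure_head(z_scene, evs, params):
    """Map one scene latent (C, H, W) to exposure latents (N, C, H, W).

    Assumed design: a tiny MLP turns each exposure value into a per-channel
    scale and shift, which modulate the shared scene latent. No sampling is
    involved, so the mapping is deterministic.
    """
    evs = np.asarray(evs, dtype=np.float64).reshape(-1, 1)   # (N, 1)
    h = np.tanh(evs @ params["w1"] + params["b1"])            # (N, hidden)
    film = h @ params["w2"] + params["b2"]                    # (N, 2C)
    scale, shift = np.split(film, 2, axis=1)                  # each (N, C)
    # Broadcast the per-channel modulation over the spatial dimensions.
    return z_scene[None] * (1.0 + scale[:, :, None, None]) + shift[:, :, None, None]

rng = np.random.default_rng(42)
z_scene = rng.normal(size=(4, 8, 8))   # stand-in for the single backbone output
evs = np.linspace(-4.0, 4.0, 9)        # densely sampled exposure values, in stops
params = make_head(channels=4)

stack = exposure_head(z_scene, evs, params)
print(stack.shape)  # (9, 4, 8, 8)
```

Because every latent in the stack is derived from the same `z_scene`, spatial structure is shared across exposures by construction, and adding more exposure values only costs extra evaluations of the small head, not additional diffusion runs. In this toy version the zero biases also make an exposure value of 0 return the scene latent unchanged.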
Problem

Research questions and friction points this paper is trying to address.

HDR generation
diffusion models
exposure consistency
computational cost
latent space
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent-to-Latent Mapping
Exposure Decoupling
HDR Generation
Diffusion Models
Structural Consistency
🔎 Similar Papers