🤖 AI Summary
Existing diffusion models support only 8-bit LDR inputs, limiting their ability to generate physically plausible, detail-rich HDR images and hindering deployment in graphics applications that require linear HDR representations. This paper proposes the first latent-space, exposure-fusion-based diffusion framework for HDR image generation: it adapts image-level multi-exposure fusion techniques to the latent space of pretrained diffusion models, enabling efficient fine-tuning with only a small amount of HDR data. An HDR-guided loss and a linear-domain reconstruction constraint enable LDR-to-HDR translation without paired supervision. The method effectively recovers fine detail in overexposed and underexposed regions, and it achieves state-of-the-art performance on HDR environment map generation and depth-of-field simulation, with significantly improved dynamic range and visual realism.
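For intuition, the image-space exposure fusion the paper adapts to latent space can be sketched very simply: each exposure is weighted per pixel by how "well exposed" it is (a Mertens-style Gaussian weight around mid-gray), and the weighted images are averaged. This is an illustrative toy, not the paper's latent-space method; the function name, `sigma`, and the synthetic bracket are assumptions for the demo.

```python
import numpy as np

def exposure_fusion(stack, sigma=0.2):
    """Fuse a stack of LDR exposures (values in [0, 1]) into one image
    by weighting each pixel by its well-exposedness (Mertens-style).
    `stack` has shape (n_exposures, H, W)."""
    stack = np.asarray(stack, dtype=np.float64)
    # Gaussian weight: pixels near mid-gray (0.5) are trusted most;
    # clipped blacks and whites get near-zero weight.
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    # Normalize across the exposure axis so weights sum to 1 per pixel.
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * stack).sum(axis=0)

# Synthetic bracket: the same linear-radiance scene captured at three
# simulated exposure times, each clipped to the 8-bit-style [0, 1] range.
scene = np.linspace(0.0, 4.0, 64).reshape(8, 8)
bracket = [np.clip(scene * t, 0.0, 1.0) for t in (0.25, 1.0, 4.0)]
fused = exposure_fusion(bracket)
```

LEDiff's key idea is to perform an analogous fusion on the latent codes of a pretrained diffusion model rather than on pixels, so the generative prior can hallucinate plausible detail in the clipped regions.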
📝 Abstract
While consumer displays increasingly support more than 10 stops of dynamic range, most image assets, such as internet photographs and generative AI content, remain limited to 8-bit low dynamic range (LDR), constraining their utility in high dynamic range (HDR) applications. Currently, no generative model can produce high-bit-depth, high-dynamic-range content in a generalizable way, and existing LDR-to-HDR conversion methods often struggle to produce photorealistic details and physically plausible dynamic range in clipped regions. We introduce LEDiff, a method that equips a generative model with HDR content generation through latent-space fusion inspired by image-space exposure fusion techniques. It also functions as an LDR-to-HDR converter, expanding the dynamic range of existing LDR images. Our approach uses a small HDR dataset to enable a pretrained diffusion model to recover detail and dynamic range in clipped highlights and shadows. LEDiff brings HDR capability to existing generative models and converts any LDR image to HDR, producing photorealistic HDR outputs for image generation, image-based lighting (HDR environment map generation), and photographic effects such as depth-of-field simulation, where linear HDR data is essential for realistic results.
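The claim that linear HDR data is essential for depth-of-field simulation can be checked with a toy experiment (an illustration, not from the paper): defocus blur spreads a highlight's energy over the blur kernel, so a bright light that was clipped to white in an LDR image produces a washed-out bokeh disc, while the true linear radiance keeps it bright. The scene values and kernel size below are arbitrary assumptions.

```python
import numpy as np

def box_blur(row, k):
    """1-D box blur as a crude stand-in for a defocus kernel."""
    kernel = np.ones(k) / k
    return np.convolve(row, kernel, mode="same")

# A dark scene with one very bright point light, in linear radiance.
hdr = np.zeros(101)
hdr[50] = 50.0
ldr = np.clip(hdr, 0.0, 1.0)      # 8-bit-style clipping at display white

blurred_hdr = box_blur(hdr, 11)   # bokeh stays bright: peak = 50/11 ~ 4.5
blurred_ldr = box_blur(ldr, 11)   # clipped highlight washes out: 1/11 ~ 0.09
```

After blurring, the HDR highlight is still far above display white while the clipped version is a faint smear, which is why recovering linear radiance (as LEDiff does) matters for realistic lens effects.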