🤖 AI Summary
This work addresses the limited effectiveness of linear color spaces in RAW/HDR image restoration (denoising, deblurring, super-resolution). It systematically compares training strategies and shows that replacing conventional linear color spaces with display-encoded representations (specifically the PQ, PU21, and mu-law transfer functions) better aligns neural network optimization with human visual perception. The key contribution is empirical evidence that perceptually uniform display encodings significantly accelerate convergence and improve reconstruction quality, outperforming the alternative of training in linear color spaces with loss functions that correct for perceptual non-uniformity. Across denoising, deblurring, and super-resolution tasks, this change to the training strategy yields consistent PSNR gains of 2-9 dB. Crucially, display encoding is an invertible transformation, so the physical interpretability of RAW/HDR data is preserved while both perceptual fidelity and deep-learning optimization requirements are better satisfied. This establishes a generalizable, perception-aware training recipe for high-dynamic-range image restoration.
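For intuition, the mu-law and PQ transfer functions named above can be sketched as follows. This is an illustration only, not the paper's code: the choice of mu = 5000 and the normalization of linear values to [0, 1] are assumptions, while the PQ constants come from SMPTE ST 2084.

```python
import numpy as np

def mu_law_encode(x, mu=5000.0):
    """Mu-law display encoding: compresses linear HDR values in [0, 1]
    into a roughly perceptually uniform [0, 1] range."""
    return np.log1p(mu * x) / np.log1p(mu)

def pq_encode(y):
    """SMPTE ST 2084 (PQ) inverse EOTF. `y` is linear luminance
    normalized so that 1.0 corresponds to the 10,000 cd/m^2 peak."""
    m1, m2 = 2610 / 16384, 2523 / 32
    c1, c2, c3 = 3424 / 4096, 2413 / 128, 2392 / 128
    yp = np.power(np.clip(y, 0.0, 1.0), m1)
    return np.power((c1 + c2 * yp) / (1.0 + c3 * yp), m2)

# Both encodings allocate most of the output range to dark tones,
# mirroring the eye's higher sensitivity at low luminance.
x = np.linspace(0.0, 1.0, 5)
print(mu_law_encode(x))  # monotone, maps 0 -> 0 and 1 -> 1
print(pq_encode(x))
```

Both curves are monotone and invertible, which is why display encoding can be applied to network inputs and targets without losing the underlying linear data.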
📝 Abstract
The vast majority of standard image and video content available online is represented in display-encoded color spaces, in which pixel values are conveniently scaled to a limited range (0-1) and the color distribution is approximately perceptually uniform. In contrast, both camera RAW and high dynamic range (HDR) images are often represented in linear color spaces, in which color values are linearly related to colorimetric quantities of light. While training on commonly available display-encoded images is a well-established practice, there is no consensus on how neural networks should be trained for tasks on RAW and HDR images in linear color spaces. In this work, we test several approaches on three popular image restoration applications: denoising, deblurring, and single-image super-resolution. We examine whether HDR/RAW images need to be display-encoded using popular transfer functions (PQ, PU21, and mu-law), or whether it is better to train in linear color spaces, but use loss functions that correct for perceptual non-uniformity. Our results indicate that neural networks train significantly better on HDR and RAW images represented in display-encoded color spaces, which offer better perceptual uniformity than linear spaces. This small change to the training strategy can bring a very substantial gain in performance, between 2 and 9 dB.
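The two alternatives the abstract compares can be sketched as a choice of loss domain: compute the training loss on display-encoded pixel values, or compute it directly in the linear color space. The mu-law encoder with mu = 5000 is an illustrative assumption, not the paper's specific configuration.

```python
import numpy as np

def mu_law(x, mu=5000.0):
    # Mu-law display encoding of linear values in [0, 1].
    return np.log1p(mu * x) / np.log1p(mu)

def linear_mse(pred, target):
    # Loss in the linear color space: dominated by absolute errors,
    # which mostly occur in bright pixels.
    return np.mean((pred - target) ** 2)

def encoded_mse(pred, target):
    # Loss after display encoding: errors are weighted roughly
    # according to their perceptual visibility.
    return np.mean((mu_law(pred) - mu_law(target)) ** 2)

# The same absolute error is far more visible in shadows than in
# highlights; the encoded loss reflects this, the linear loss does not.
dark = encoded_mse(np.array([0.011]), np.array([0.01]))
bright = encoded_mse(np.array([0.901]), np.array([0.9]))
print(dark > bright)  # True: equal linear error, larger perceptual penalty in the dark
```

This asymmetry is the mechanism behind the reported gains: a network trained in the encoded domain spends its capacity where errors are actually visible.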