🤖 AI Summary
This work addresses the challenging problem of reconstructing sharp, high dynamic range (HDR) 3D scenes from a single-exposure blurry low dynamic range (LDR) image and its corresponding event stream, captured under extreme lighting conditions. We propose the first method that explicitly models the physical imaging process within a Neural Radiance Fields (NeRF) framework, jointly optimizing pixel-wise RGB and event mapping fields alongside the radiance field. By fusing data from a single-exposure LDR image and an event camera, our approach directly recovers a physically accurate HDR radiance field, enabling end-to-end HDR deblurring and novel view synthesis that significantly outperforms existing techniques on both a newly collected dataset and public benchmarks. To the best of our knowledge, this is the first unified HDR NeRF reconstruction pipeline to leverage the complementary information in event streams and conventional LDR images.
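To make the pipeline concrete, below is a minimal PyTorch sketch of the forward imaging path it describes: sharp HDR radiance is rendered at several instants within the exposure, averaged in the radiance domain to model motion blur, and then mapped to LDR pixel values by a learned pixel-wise RGB mapping field standing in for the camera response. The names (`RGBMappingField`, `render_blurry_ldr`) and the small-MLP parameterization conditioned on pixel coordinates are hypothetical illustrations; the paper's actual architecture may differ.

```python
# Hypothetical sketch of the physical forward model described above;
# not the authors' implementation.
import torch
import torch.nn as nn

class RGBMappingField(nn.Module):
    """Pixel-wise mapping from HDR radiance to LDR pixel values, standing in
    for the camera response function and exposure (assumed MLP form)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + 2, hidden), nn.ReLU(),  # HDR RGB + pixel coords (x, y)
            nn.Linear(hidden, 3), nn.Sigmoid(),   # LDR RGB in [0, 1]
        )

    def forward(self, hdr_rgb: torch.Tensor, pix_xy: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([hdr_rgb, pix_xy], dim=-1))

def render_blurry_ldr(nerf, rgb_map, rays_at_times, pix_xy):
    """Blur is modeled in the physical (HDR radiance) domain: average sharp
    HDR renderings over the exposure window, then tone-map to LDR for
    comparison with the recorded input image."""
    hdr_samples = torch.stack([nerf(rays) for rays in rays_at_times])  # (T, N, 3)
    hdr_blurred = hdr_samples.mean(dim=0)                              # exposure average
    return rgb_map(hdr_blurred, pix_xy)                                # (N, 3) LDR
```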
📝 Abstract
Novel view synthesis from low dynamic range (LDR) blurry images, which are common in the wild, struggles to recover sharp, high dynamic range (HDR) 3D representations under extreme lighting conditions. Although existing methods employ event data to address this issue, they ignore the sensor-physics mismatch between the camera output and physical-world radiance, resulting in suboptimal HDR and deblurring results. To address this problem, we propose a unified, sensor-physics-grounded NeRF framework for sharp HDR novel view synthesis from single-exposure blurry LDR images and their corresponding events. We employ NeRF to directly represent the actual radiance of the 3D scene in the HDR domain and model the raw HDR scene rays hitting the sensor pixels as in the physical world. A pixel-wise RGB mapping field is introduced to align these rendered pixel values with the sensor-recorded LDR pixel values of the input images, and a novel event mapping field is designed to bridge the physical scene dynamics and the actual event sensor output. The two mapping fields are jointly optimized with the NeRF network, leveraging the spatial and temporal dynamics encoded in the events to enhance sharp HDR 3D representation learning. Experiments on our collected dataset and public benchmarks demonstrate that our method achieves state-of-the-art deblurring and HDR novel view synthesis results from single-exposure blurry LDR images and corresponding events.
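The event side of the objective can be sketched in the same spirit. Under the standard idealized event model, a pixel fires when its log intensity changes by a contrast threshold C, so the accumulated event count between times t1 and t2 is roughly (log L(t2) − log L(t1)) / C. The `EventMappingField` below is a hypothetical learned per-pixel correction on top of that idealized model, meant only to illustrate how rendered HDR radiance can be supervised by recorded events; it is not the paper's exact formulation.

```python
# Hedged sketch of an event supervision term; names and shapes are assumptions.
import torch
import torch.nn as nn

class EventMappingField(nn.Module):
    """Maps idealized log-radiance differences to predicted sensor events,
    absorbing per-pixel threshold variation and sensor non-idealities."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1 + 2, hidden), nn.ReLU(),  # log-diff + pixel coords (x, y)
            nn.Linear(hidden, 1),
        )

    def forward(self, log_diff: torch.Tensor, pix_xy: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([log_diff, pix_xy], dim=-1))

def event_loss(nerf, event_map, rays_t1, rays_t2, pix_xy, observed_events, eps=1e-6):
    """Compare events predicted from rendered HDR radiance at two instants
    against the accumulated event counts recorded by the sensor."""
    # Channel mean as a crude luminance proxy (a simplifying assumption).
    l1 = nerf(rays_t1).mean(dim=-1, keepdim=True)  # HDR intensity at t1, (N, 1)
    l2 = nerf(rays_t2).mean(dim=-1, keepdim=True)  # HDR intensity at t2, (N, 1)
    log_diff = torch.log(l2 + eps) - torch.log(l1 + eps)
    pred = event_map(log_diff, pix_xy)
    return torch.mean((pred - observed_events) ** 2)
```

In training, a term like this would be summed with the LDR photometric loss, so that the NeRF and the two mapping fields are optimized jointly, as the abstract describes.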