🤖 AI Summary
Existing 3D scene relighting methods struggle to achieve both high quality and controllability in complex, real-world room-scale environments: inverse-rendering approaches are hindered by the ill-posed nature of the problem, while diffusion-based approaches have been limited to 2D images and videos or to single 3D objects. This work proposes a paradigm that, for the first time, integrates a video-to-video diffusion-based relighting model with 3D reconstruction. By distilling the relit video outputs into a 3D radiance field, the method circumvents direct inverse rendering while enabling high-fidelity, multi-view-consistent, and user-controllable relighting at room scale. Extensive experiments on both synthetic and real-world datasets demonstrate its effectiveness, significantly improving photorealism when rendering complex indoor scenes under novel lighting conditions.
📝 Abstract
We present a method for relighting 3D reconstructions of large room-scale environments. Existing solutions for 3D scene relighting often require solving under-determined or ill-conditioned inverse rendering problems, and as such are unable to produce high-quality results on complex real-world scenes. Recent progress in using generative image and video diffusion models for relighting has been promising, but these techniques are either limited to 2D image and video relighting or to 3D relighting of individual objects. Our approach enables controllable 3D relighting of room-scale scenes by distilling the outputs of a video-to-video relighting diffusion model into a 3D reconstruction. This sidesteps the need to solve a difficult inverse rendering problem and results in a flexible system that can relight 3D reconstructions of complex real-world scenes. We validate our approach on both synthetic and real-world datasets, showing that it faithfully renders novel views of scenes under new lighting conditions.
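To make the distillation idea concrete, here is a minimal sketch of the loop the abstract describes: render views of the scene, relight them with a frozen video-to-video diffusion model, then fit the 3D representation to the relit frames with a photometric loss instead of solving inverse rendering. `RelightingDiffusion`, `RadianceField`, and `distill_relighting` are hypothetical stand-ins for illustration, not the authors' actual components; the real system presumably uses a pretrained conditional video diffusion model and a full 3D radiance field rather than the toy per-view parameterization used here.

```python
# Sketch, under stated assumptions, of distilling relit video frames into a
# learnable scene representation. All class and function names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelightingDiffusion(nn.Module):
    """Stand-in for a frozen video-to-video relighting model: takes a clip of
    frames plus a lighting condition and returns relit frames."""
    def forward(self, frames: torch.Tensor, lighting: torch.Tensor) -> torch.Tensor:
        # Placeholder: a real model would run conditional video diffusion here.
        return frames  # identity stub so the sketch runs end to end

class RadianceField(nn.Module):
    """Stand-in for a differentiable 3D representation (e.g., NeRF or 3DGS).
    Here: one learnable image per camera pose, purely for illustration."""
    def __init__(self, n_views: int = 8, h: int = 64, w: int = 64):
        super().__init__()
        self.images = nn.Parameter(torch.rand(n_views, 3, h, w))
    def render(self, pose_ids: torch.Tensor) -> torch.Tensor:
        return self.images[pose_ids]

def distill_relighting(field, relighter, poses, lighting, steps=200, lr=1e-2):
    """Fit the radiance field to relit renders (pseudo ground truth),
    sidestepping any explicit inverse rendering of materials and lights."""
    opt = torch.optim.Adam(field.parameters(), lr=lr)
    with torch.no_grad():
        source = field.render(poses)          # current renders of the scene
        target = relighter(source, lighting)  # relit frames become the target
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(field.render(poses), target)  # photometric loss
        loss.backward()
        opt.step()
    return field

# Usage: relight an 8-view toy scene under a new (here, random) lighting code.
field = distill_relighting(RadianceField(), RelightingDiffusion(),
                           poses=torch.arange(8), lighting=torch.randn(1, 16))
```

The appeal of this structure, as the abstract argues, is that the generative model's per-frame edits are amortized into a single scene representation, so novel views inherit the new lighting in a multi-view-consistent way.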