Spatiotemporally Consistent Indoor Lighting Estimation with Diffusion Priors

📅 2025-08-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Indoor illumination estimation, i.e., recovering a spatiotemporally continuous light field from a single image or video, is severely ill-posed and generalizes poorly zero-shot to in-the-wild settings. To address this, the paper proposes a diffusion-prior-driven framework that models the light field with an MLP: a fine-tuned pre-trained 2D image diffusion model serves as a strong illumination prior; jointly inpainting multiple chrome-ball light probes yields illumination observations at multiple scene locations, including occluded regions; and temporal consistency constraints regularize video-based illumination estimation. Combining insights from neural radiance fields with a continuous light field representation, the method achieves high-fidelity, spatiotemporally coherent dynamic illumination reconstruction on in-the-wild videos without scene-specific training. Extensive experiments demonstrate significant improvements over state-of-the-art baselines on both single-image and video illumination estimation.

📝 Abstract
Indoor lighting estimation from a single image or video remains a challenge due to its highly ill-posed nature, especially when the lighting condition of the scene varies spatially and temporally. We propose a method that estimates from an input video a continuous light field describing the spatiotemporally varying lighting of the scene. We leverage 2D diffusion priors for optimizing such a light field, represented as an MLP. To enable zero-shot generalization to in-the-wild scenes, we fine-tune a pre-trained image diffusion model to predict lighting at multiple locations by jointly inpainting multiple chrome balls as light probes. We evaluate our method on indoor lighting estimation from a single image or video and show superior performance over compared baselines. Most importantly, we highlight results on spatiotemporally consistent lighting estimation from in-the-wild videos, which is rarely demonstrated in previous works.
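The continuous light field described above can be pictured as a small neural network queried at arbitrary scene positions, directions, and times. The following is a minimal sketch of such a representation; the class name, layer sizes, and input encoding are illustrative assumptions, not the paper's actual architecture (which would typically also use positional encodings).

```python
import torch
import torch.nn as nn

class LightFieldMLP(nn.Module):
    """Hypothetical continuous light field: maps a 3D position, a unit
    viewing direction, and a time value to non-negative HDR radiance (RGB).
    Layer sizes are illustrative, not taken from the paper."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Softplus(),  # keep radiance positive
        )

    def forward(self, x, d, t):
        # x: (N, 3) positions, d: (N, 3) unit directions, t: (N, 1) times
        return self.net(torch.cat([x, d, t], dim=-1))

field = LightFieldMLP()
x = torch.zeros(4, 3)
d = torch.nn.functional.normalize(torch.randn(4, 3), dim=-1)
t = torch.zeros(4, 1)
radiance = field(x, d, t)
print(radiance.shape)  # torch.Size([4, 3])
```

Rendering a chrome-ball probe at a given location then amounts to evaluating this field over the directions visible on the sphere, which is what the inpainting diffusion model supervises.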
Problem

Research questions and friction points this paper is trying to address.

Estimating spatially and temporally varying indoor lighting from videos
Leveraging diffusion priors for continuous light field optimization
Achieving zero-shot generalization to in-the-wild scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 2D diffusion priors for light field optimization
Fine-tunes pre-trained diffusion model for multi-location lighting
Estimates spatiotemporally consistent lighting from videos
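The third point, spatiotemporal consistency, is commonly enforced with a smoothness regularizer on the optimized field across adjacent frames. Below is a minimal sketch of one such penalty, assuming the light field is callable as `field(x, d, t)`; the function name, frame interval, and squared-error form are assumptions for illustration, not the paper's exact loss.

```python
import torch

def temporal_consistency_loss(field, x, d, t, dt=1.0 / 30):
    """Hypothetical temporal regularizer: penalize the change in predicted
    radiance between adjacent video frames at the same spatial query."""
    r_t = field(x, d, t)
    r_next = field(x, d, t + dt)
    return ((r_t - r_next) ** 2).mean()

# Stand-in light field for demonstration: radiance varies smoothly with time.
field = lambda x, d, t: torch.sin(t).expand(-1, 3) * 0.5 + 0.5
x = torch.zeros(8, 3)
d = torch.zeros(8, 3)
t = torch.rand(8, 1)
loss = temporal_consistency_loss(field, x, d, t)
```

During optimization this term would be summed with the diffusion-prior objective on the inpainted probe images, trading off per-frame fidelity against flicker-free lighting over time.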