🤖 AI Summary
This work addresses the inherent ambiguity in disentangling reflectance, texture, and illumination from a single image. The authors propose a multi-object generative inverse rendering method that leverages the prior that all objects in a scene share a common illumination. By employing a diffusion model, the approach jointly recovers per-object reflectance and texture while estimating the shared lighting. Key innovations include a cascaded end-to-end architecture, a Coordinated Guidance mechanism, an Axial Attention module, and a Texture Extraction ControlNet, which together enable joint disentanglement in both image and angular spaces. This design effectively enforces illumination consistency across multiple objects while preserving high-frequency details. Experiments demonstrate that, given known geometry and a single input image containing multiple objects, the method significantly improves the accuracy and visual quality of material and lighting decomposition.
📝 Abstract
We introduce Multi-Object Generative Perception (MultiGP), a generative inverse rendering method for stochastic sampling of all radiometric constituents -- reflectance, texture, and illumination -- underlying object appearance in a single image. Our key idea for resolving this inherently ambiguous radiometric disentanglement is to leverage the fact that, while their textures and reflectances may differ, all objects in the same scene are lit by the same illumination. MultiGP exploits this consensus to produce samples of reflectance, texture, and illumination from a single image of known shapes, based on four key technical contributions: a cascaded end-to-end architecture that combines image-space and angular-space disentanglement; Coordinated Guidance, which drives diffusion sampling to converge on a single consistent illumination estimate; Axial Attention, applied to facilitate "cross-talk" between objects of different reflectance; and a Texture Extraction ControlNet that preserves high-frequency texture details while decoupling them from the estimated lighting. Experimental results demonstrate that MultiGP effectively leverages the complementary spatial and frequency characteristics of multiple object appearances to recover each object's texture and reflectance as well as the common illumination.
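To make the cross-object "cross-talk" idea concrete, the following is a minimal sketch of attention applied along the object axis: each spatial token attends only to the corresponding token of the other objects, which is how information about the shared illumination can flow between objects. This is an illustration only, not the authors' implementation -- the function name, the use of a single head, and the identity query/key/value projections are assumptions made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention_over_objects(feats):
    """Self-attention along the object axis, independently per spatial token.

    feats: array of shape (n_objects, n_tokens, dim), one feature map per object.
    Returns an array of the same shape in which each token has attended to the
    corresponding token of every object (including itself).
    """
    n_obj, n_tok, d = feats.shape
    # Move the object axis inward so attention runs over objects: (n_tok, n_obj, dim).
    x = feats.transpose(1, 0, 2)
    # Single-head scaled dot-product attention with identity projections.
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(d)  # (n_tok, n_obj, n_obj)
    attn = softmax(scores, axis=-1)
    out = attn @ x                                  # (n_tok, n_obj, dim)
    return out.transpose(1, 0, 2)                   # back to (n_obj, n_tok, dim)
```

Note the contrast with full attention over all objects and tokens jointly: restricting attention to the object axis keeps the cost linear in the number of spatial tokens while still letting objects with different reflectance exchange evidence about the one illumination they share.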