GeoRelight: Learning Joint Geometrical Relighting and Reconstruction with Flexible Multi-Modal Diffusion Transformers

📅 2026-04-22

🤖 AI Summary
Single-image portrait relighting is highly ill-posed because a 2D image entangles geometry, material, and illumination. Existing approaches either accumulate errors across sequential stages or lack explicit geometric constraints, which leads to physically inconsistent results. This work proposes a unified multimodal diffusion Transformer that jointly performs relighting and 3D geometry reconstruction. It introduces iNOD (isotropic NDC-Orthographic Depth), a distortion-free 3D representation compatible with latent diffusion models, together with a mixed-data training strategy that combines synthetic data with automatically labeled real data. According to the authors, the approach outperforms both sequential and geometry-agnostic methods on relighting and geometry reconstruction, improving physical consistency and visual quality.

📝 Abstract
Relighting a person from a single photo is an attractive but ill-posed task, as a 2D image ambiguously entangles 3D geometry, intrinsic appearance, and illumination. Current methods either use sequential pipelines that suffer from error accumulation, or they do not explicitly leverage 3D geometry during relighting, which limits physical consistency. Since relighting and estimation of 3D geometry are mutually beneficial tasks, we propose a unified Multi-Modal Diffusion Transformer (DiT) that jointly solves for both: GeoRelight. We make this possible through two key technical contributions: isotropic NDC-Orthographic Depth (iNOD), a distortion-free 3D representation compatible with latent diffusion models; and a strategic mixed-data training method that combines synthetic and auto-labeled real data. By solving geometry and relighting jointly, GeoRelight achieves better performance than both sequential models and previous systems that ignored geometry.
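
The mixed-data idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how synthetic and auto-labeled real samples are mixed, so everything below is an assumption for illustration only: the function name `mixed_batches`, the `real_fraction` parameter, and the fixed per-batch ratio are hypothetical and are not taken from the paper.

```python
import random


def mixed_batches(synthetic, real, batch_size, real_fraction, seed=0):
    """Yield batches that mix synthetic and auto-labeled real samples.

    Hypothetical sketch: the fixed per-batch ratio is an assumption,
    not the training schedule described by the GeoRelight authors.
    """
    rng = random.Random(seed)
    # Number of real samples per batch; the rest come from synthetic data.
    n_real = round(batch_size * real_fraction)
    n_syn = batch_size - n_real
    while True:
        # Sample without replacement within a batch, then shuffle so the
        # two sources are interleaved.
        batch = rng.sample(real, n_real) + rng.sample(synthetic, n_syn)
        rng.shuffle(batch)
        yield batch


# Toy usage: each sample is tagged with its source.
synthetic = [("syn", i) for i in range(10)]
real = [("real", i) for i in range(10)]
batch = next(mixed_batches(synthetic, real, batch_size=8, real_fraction=0.25))
```

In this toy setup each batch of 8 contains 2 real and 6 synthetic samples. A real pipeline would likely also track which samples carry ground-truth geometry and which carry automatic labels, but that detail is not covered by the abstract.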

Problem

Research questions and friction points this paper is trying to address.

relighting
3D geometry
single-image
physical consistency
ill-posed problem

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Modal Diffusion Transformer
Joint Relighting and Reconstruction
iNOD
Geometric Consistency
Mixed-Data Training