EgoRelight: Egocentric Human Capture and Illumination Recovery for Relightable and Photoreal Avatar Rendering

📅 2026-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of achieving photorealistic telepresence under the limited field of view inherent to head-mounted mixed reality devices. The paper presents an end-to-end framework that, using only a single HMD equipped with downward-facing stereo cameras, jointly performs human motion capture, relightable appearance modeling, and HDR environment lighting estimation. The approach uses dense stereo depth maps as geometric control signals to drive a mesh-based avatar, and introduces a neural appearance model that synthesizes view-independent diffuse and view-dependent specular shading separately, without relying on analytical BRDF priors. At test time, inverse rendering recovers an HDR environment map by matching the pre-trained avatar's appearance to live egocentric camera observations, so the avatar can be relit to match its surroundings. According to the authors, experiments show that the method significantly outperforms state-of-the-art baselines in geometric accuracy, rendering fidelity, and relighting fidelity, and the system is demonstrated in a social telepresence application.
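As a point of reference for the stereo depth step, the textbook relation between disparity and depth for a rectified pinhole stereo pair is depth = focal length × baseline / disparity. The sketch below illustrates only that relation. The summary does not say how EgoRelight computes its depth maps, so the function name, parameters, and the rectified-pinhole assumption are all illustrative and not taken from the paper.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m, min_disparity=1e-6):
    """Convert a rectified-stereo disparity map (pixels) to depth (meters).

    Assumes an idealized rectified pinhole stereo pair (an assumption made
    for illustration; real HMD cameras may need different handling).
    disparity_px is a list of rows, each a list of per-pixel disparities.
    """
    depth_m = []
    for row in disparity_px:
        depth_m.append([
            # depth = f * B / d; near-zero disparity means "infinitely far"
            focal_px * baseline_m / d if d > min_disparity else float("inf")
            for d in row
        ])
    return depth_m


# Hypothetical numbers: 500 px focal length, 10 cm baseline.
depth = disparity_to_depth([[10.0, 25.0, 0.0]], focal_px=500.0, baseline_m=0.1)
```

Larger disparities map to closer surfaces, which is why a short-baseline stereo pair is most precise at close range.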
📝 Abstract
Mixed Reality (MR) headsets promise a future of immersive telepresence where virtual humans blend indistinguishably into real or virtual surroundings. Achieving this vision requires a method for capturing a user's motion, estimating appearance under novel lighting, and understanding the environment, all from the constrained viewpoint of a head-mounted display (HMD). Existing approaches treat these as isolated problems: they either focus on driving avatars with baked-in lighting or rely on studio setups for relighting. In this paper, we present EgoRelight, a holistic framework for egocentric telepresence that simultaneously captures full-body human performance, synthesizes photorealistic and relightable appearance, and estimates high dynamic range (HDR) environment maps from a single HMD. First, to enable motion and surface reconstruction, we propose an egocentric perception module that leverages stereo down-facing cameras to extract dense depth maps, which serve as geometric control signals to drive a mesh-based avatar. Second, we introduce a novel neural appearance model that learns to synthesize view-dependent specular and view-independent diffuse shading separately. By employing a specialized ray-sampling strategy, our model generalizes to unseen illumination without relying on restrictive analytical BRDF priors. Third, we enable seamless avatar integration into the physical world via a test-time inverse rendering process, which recovers an HDR environment map by matching the pre-trained avatar's appearance to live egocentric camera observations. We demonstrate our system through a social telepresence application, where remote users are coherently relit according to their physical environment. Extensive experiments show that our components and the integrated system significantly outperform state-of-the-art baselines in geometric accuracy, rendering fidelity, and relighting fidelity.
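The test-time inverse rendering idea can be illustrated with a toy problem. The sketch below assumes, purely for illustration, that each observed pixel is a fixed linear combination of environment-map texel radiances (a "transport" matrix that a frozen appearance model would supply) and recovers non-negative, unbounded radiance by projected gradient descent. The linear model, the function `recover_env_map`, and all numbers are hypothetical; the abstract does not specify the paper's actual optimization.

```python
def recover_env_map(transport, observed, steps=2000, lr=0.1):
    """Fit non-negative light intensities so that transport @ env ~ observed.

    transport: list of rows, one per observed pixel; entry [i][j] is how much
               light source j contributes to pixel i (assumed known and fixed).
    observed:  list of observed pixel intensities.
    lr must be small enough for gradient descent to converge on this problem.
    """
    n_pixels = len(transport)
    n_lights = len(transport[0])
    env = [1.0] * n_lights  # neutral starting guess
    for _ in range(steps):
        # Residual between rendered and observed pixels.
        residual = [
            sum(t * e for t, e in zip(row, env)) - obs
            for row, obs in zip(transport, observed)
        ]
        # Gradient of 0.5 * ||residual||^2 with respect to env.
        grad = [
            sum(transport[i][j] * residual[i] for i in range(n_pixels))
            for j in range(n_lights)
        ]
        # Gradient step, then project onto radiance >= 0 (no upper clamp: HDR).
        env = [max(0.0, e - lr * g) for e, g in zip(env, grad)]
    return env


# Toy scene: two light directions, three observed pixels.
transport = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
true_env = [4.0, 0.25]  # values above 1.0 are valid in HDR
observed = [sum(t * e for t, e in zip(row, true_env)) for row in transport]
estimated = recover_env_map(transport, observed)
```

Leaving the radiance unclamped from above is what makes the estimate high dynamic range in this toy setting; a real system would also handle color channels, noise, and a nonlinear appearance model.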
Problem

Research questions and friction points this paper is trying to address.

egocentric capture
relightable avatar
photorealistic rendering
HDR environment estimation
telepresence
Innovation

Methods, ideas, or system contributions that make the work stand out.

egocentric capture
relightable avatar
neural appearance model
inverse rendering
HDR environment estimation