EgoRelight: Egocentric Human Capture and Illumination Recovery for Relightable and Photoreal Avatar Rendering

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses the challenge of achieving photorealistic telepresence under the limited field of view inherent to head-mounted mixed reality devices. The paper presents the first end-to-end framework that, using only a single HMD equipped with downward-facing stereo cameras, jointly performs human motion capture, relightable appearance modeling, and HDR environmental lighting estimation. The approach integrates stereo depth estimation with mesh-driven avatar reconstruction to recover geometry and introduces a neural appearance model that disentangles diffuse and specular components without requiring explicit BRDF priors. At test time, inverse rendering recovers an HDR environment map, enabling environment-adaptive, high-fidelity virtual human rendering. Experiments demonstrate that the proposed method significantly outperforms existing techniques in geometric accuracy, rendering realism, and relighting consistency, and has been successfully integrated into an immersive remote social system.
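The stereo depth step mentioned above rests on the standard rectified-stereo relation Z = f * B / d (depth equals focal length times baseline divided by disparity). The following is a minimal sketch of that conversion, assuming rectified cameras. The function name `disparity_to_depth`, its parameters, and the numeric values are illustrative assumptions, not taken from the paper, which does not state how its depth estimator is implemented.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m, min_disparity=1e-6):
    """Convert a row-major disparity map (pixels) to metric depth (meters).

    Uses the rectified-stereo relation Z = f * B / d. Pixels whose disparity
    is at or below `min_disparity` are treated as infinitely far away.
    """
    depth = []
    for row in disparity_px:
        depth.append([
            focal_px * baseline_m / d if d > min_disparity else float("inf")
            for d in row
        ])
    return depth


# Hypothetical camera parameters: 400 px focal length, 12.5 cm baseline.
disparity = [[50.0, 25.0], [10.0, 0.0]]
depth = disparity_to_depth(disparity, focal_px=400.0, baseline_m=0.125)
```

Because depth is inversely proportional to disparity, nearby body parts seen by downward-facing cameras produce large disparities and comparatively precise depth, while small disparities amplify matching errors.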

📝 Abstract

Mixed Reality (MR) headsets promise a future of immersive telepresence where virtual humans blend indistinguishably into real or virtual surroundings. Achieving this vision requires a method for capturing a user's motion, estimating appearance under novel lighting, and understanding the environment, all from the constrained viewpoint of a head-mounted display (HMD). Existing approaches treat these as isolated problems: they either focus on driving avatars with baked-in lighting or rely on studio setups for relighting. In this paper, we present EgoRelight, a holistic framework for egocentric telepresence that simultaneously captures full-body human performance, synthesizes photorealistic and relightable appearance, and estimates high dynamic range (HDR) environment maps from a single HMD. First, to reconstruct motion and surface geometry, we propose an egocentric perception module that leverages stereo down-facing cameras to extract dense depth maps, which serve as geometric control signals to drive a mesh-based avatar. Second, we introduce a novel neural appearance model that learns to synthesize view-dependent specular and view-independent diffuse shading separately. By employing a specialized ray-sampling strategy, our model generalizes to unseen illumination without relying on restrictive analytical BRDF priors. Third, we enable seamless avatar integration into the physical world via a test-time inverse rendering process, which recovers an HDR environment map by matching the pre-trained avatar's appearance to live egocentric camera observations. We demonstrate our system through a social telepresence application, where remote users are coherently relit according to their physical environment. Extensive experiments show that our components and the integrated system significantly outperform state-of-the-art baselines in geometric accuracy, rendering quality, and relighting fidelity.
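The test-time inverse rendering idea, recovering lighting by matching a fixed appearance model to camera observations, can be illustrated with a toy stand-in. The sketch below is not the paper's method: it swaps the neural appearance model for an assumed linear light-transport matrix (pixel intensity is a weighted sum of light intensities) and fits non-negative, unbounded (HDR-style) light intensities by projected gradient descent. The name `recover_env_lighting`, the matrix, and all values are hypothetical.

```python
def recover_env_lighting(transport, observed, steps=200, lr=0.5):
    """Estimate non-negative light intensities L minimizing 0.5 * ||T L - o||^2.

    transport[p][k] is the assumed response of pixel p to a unit-intensity
    light k; observed[p] is the measured intensity of pixel p.
    """
    n_lights = len(transport[0])
    light = [0.0] * n_lights
    for _ in range(steps):
        # Residual between the rendered and the observed pixel intensities.
        residual = [
            sum(t * l for t, l in zip(row, light)) - o
            for row, o in zip(transport, observed)
        ]
        # Gradient of the objective with respect to L is T^T * residual.
        grad = [
            sum(row[k] * r for row, r in zip(transport, residual))
            for k in range(n_lights)
        ]
        # Gradient step, then projection onto the constraint L >= 0.
        light = [max(0.0, l - lr * g) for l, g in zip(light, grad)]
    return light


# Toy setup: three pixels lit by two lights; values above 1.0 mimic HDR.
transport = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
true_light = [4.0, 0.25]
observed = [sum(t * l for t, l in zip(row, true_light)) for row in transport]
estimated = recover_env_lighting(transport, observed)
```

In this toy case the transport matrix is well conditioned, so the estimate converges to the true intensities. With a learned, non-linear appearance model the same matching objective would typically be minimized with automatic differentiation instead of a closed-form gradient.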

Problem

Research questions and friction points this paper is trying to address.

egocentric capture

relightable avatar

photorealistic rendering

HDR environment estimation

telepresence

Innovation

Methods, ideas, or system contributions that make the work stand out.

egocentric capture

relightable avatar

neural appearance model

inverse rendering

HDR environment estimation

🔎 Similar Papers

Surfel-based Gaussian Inverse Rendering for Fast and Relightable Dynamic Human Reconstruction from Monocular Video

2024-07-21

IEEE Transactions on Pattern Analysis and Machine Intelligence

Citations: 7