🤖 AI Summary
To address the reliance of high-fidelity, relightable, and editable 3D head reconstruction on costly multi-camera systems, this paper proposes a lightweight alternative based on a single smartphone, requiring only polarizing filters, a point light source, and a darkroom for dynamic facial video capture. Methodologically, the paper introduces a polarization-based separation of the skin's diffuse and specular reflectance; a hybrid representation that embeds 2D Gaussians in UV space; and a neural analysis-by-synthesis framework that explicitly decouples geometric deformation from appearance. Key technical components include cross- and parallel-polarized video acquisition, parametric UV mapping, differentiable ray tracing, environment-map estimation, and a newly curated multi-subject facial-motion dataset. Experiments demonstrate geometric and material fidelity comparable to a Light Stage on real smartphone-captured data, enabling real-time rendering, arbitrary relighting, and pose- or expression-driven animation.
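As background for the polarization-based separation (a standard result in polarized appearance capture, not code released with the paper): single-bounce specular reflection preserves the light's polarization, so a cross-polarized camera filter blocks it, while the depolarized diffuse term passes either filter orientation at roughly half strength. A minimal sketch, with hypothetical array names:

```python
import numpy as np

def separate_reflectance(parallel: np.ndarray, cross: np.ndarray):
    """Estimate diffuse and specular components from polarized frames.

    Inputs are linear-RGB frames of shape (H, W, 3), assuming
      parallel ~ specular + 0.5 * diffuse   (specular preserves polarization)
      cross    ~ 0.5 * diffuse              (specular blocked by the filter)
    """
    diffuse = 2.0 * cross                             # recover full diffuse term
    specular = np.clip(parallel - cross, 0.0, None)   # clamp sensor-noise negatives
    return diffuse, specular
```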
📝 Abstract
Creating photorealistic, animatable, and relightable 3D head avatars traditionally requires an expensive Light Stage with multiple calibrated cameras, making it inaccessible for widespread adoption. To bridge this gap, we present a novel, cost-effective approach for creating high-quality relightable head avatars using only a smartphone equipped with polarizing filters. Our approach simultaneously captures cross-polarized and parallel-polarized video streams in a dark room with a single point light source, separating the skin's diffuse and specular components during dynamic facial performances. We introduce a hybrid representation that embeds 2D Gaussians in the UV space of a parametric head model, enabling efficient real-time rendering while preserving high-fidelity geometric detail. Our learning-based neural analysis-by-synthesis pipeline decouples pose- and expression-dependent geometric offsets from appearance, decomposing the surface into albedo, normal, and specular UV texture maps along with environment maps. We also collect a unique dataset of subjects performing diverse facial expressions and head movements.
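To make the hybrid representation concrete, the sketch below (an assumed structure, not the paper's implementation) shows how 2D Gaussians anchored in UV coordinates can follow a deforming parametric head mesh: each Gaussian stores its position in texture space, and a UV-to-triangle lookup plus barycentric interpolation places it on the current posed, expressive surface. `uv_to_face` and all field names are hypothetical.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class UVGaussians:
    """2D Gaussians parameterized in the UV space of a head mesh.

    Anchoring splats in UV coordinates (rather than world space) lets them
    ride on the deforming surface: re-evaluating the UV-to-3D mapping for a
    new expression or pose moves every Gaussian consistently.
    """
    uv: np.ndarray        # (N, 2) positions in [0, 1]^2 texture space
    scale: np.ndarray     # (N, 2) tangent-plane extents
    rotation: np.ndarray  # (N,)   in-plane rotation angles
    opacity: np.ndarray   # (N,)

def splat_positions(g: UVGaussians, mesh_vertices, faces, uv_to_face):
    """Map each Gaussian's UV anchor to a 3D point on the current mesh.

    `uv_to_face` is an assumed lookup returning, for each UV coordinate, the
    containing triangle index and barycentric weights; the 3D anchor is the
    barycentric blend of that triangle's (deformed) vertex positions.
    """
    face_idx, bary = uv_to_face(g.uv)              # (N,), (N, 3)
    tri = mesh_vertices[faces[face_idx]]           # (N, 3, 3) triangle corners
    return np.einsum('nk,nkd->nd', bary, tri)      # (N, 3) world positions
```

Because only `mesh_vertices` changes between frames, the same UV anchors can be re-projected onto every deformed mesh, which is what allows the representation to combine mesh-driven animation with Gaussian-splat rendering.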