🤖 AI Summary
This paper addresses three core challenges in reconstructing high-fidelity, relightable 3D head models from monocular video: geometric ambiguity under single-view observation, inaccurate modeling of facial expression deformations, and rendering artifacts under novel illumination. It proposes HRAvatar, an end-to-end differentiable optimization framework that uses 3D Gaussian Splatting (3DGS) as the geometric representation for efficient differentiable rendering. The method has three components: (1) a joint optimization strategy that suppresses facial tracking errors; (2) learnable blendshapes combined with learnable linear blend skinning (LBS) for personalized expression modeling; and (3) a decomposition of surface appearance into albedo, normals, and roughness, shaded with a physically based BRDF and an environment lighting encoding so that appearance is modeled independently of illumination. The authors report state-of-the-art PSNR, SSIM, and LPIPS on multiple benchmarks, along with real-time rendering and photorealistic relighting under arbitrary environment lighting.
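To make the appearance decomposition concrete, the sketch below shades a single surface point from its albedo, normal, and roughness. It is an illustration under stated assumptions, not the paper's implementation: the function name `shade`, the single directional light (standing in for the environment lighting encoding), the default `f0=0.04`, and the particular BRDF (Lambertian diffuse plus a Cook-Torrance lobe with GGX distribution, Schlick Fresnel, and a Smith-Schlick geometry term) are all choices made here for brevity.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    length = math.sqrt(_dot(v, v))
    return tuple(x / length for x in v)


def shade(albedo, normal, roughness, view_dir, light_dir, light_rgb, f0=0.04):
    """Outgoing RGB radiance at one surface point lit by one directional light.

    Assumed model (not taken from the paper): Lambertian diffuse plus a
    Cook-Torrance specular lobe. Directions point away from the surface.
    """
    roughness = max(roughness, 1e-3)  # avoid a singular lobe at roughness 0
    n, v, l = _normalize(normal), _normalize(view_dir), _normalize(light_dir)
    n_dot_l = _dot(n, l)
    n_dot_v = _dot(n, v)
    if n_dot_l <= 0.0 or n_dot_v <= 0.0:
        return (0.0, 0.0, 0.0)  # light or viewer is behind the surface

    h = _normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    n_dot_h = max(_dot(n, h), 0.0)
    v_dot_h = max(_dot(v, h), 0.0)

    # GGX normal distribution, with alpha = roughness^2
    alpha_sq = (roughness * roughness) ** 2
    denom = n_dot_h * n_dot_h * (alpha_sq - 1.0) + 1.0
    distribution = alpha_sq / (math.pi * denom * denom)

    # Schlick approximation of the Fresnel term
    fresnel = f0 + (1.0 - f0) * (1.0 - v_dot_h) ** 5

    # Smith geometry term with the Schlick-GGX remapping for direct light
    k = (roughness + 1.0) ** 2 / 8.0
    geometry = (n_dot_l / (n_dot_l * (1.0 - k) + k)) * (
        n_dot_v / (n_dot_v * (1.0 - k) + k)
    )

    specular = distribution * fresnel * geometry / (4.0 * n_dot_l * n_dot_v)

    # Diffuse and specular share the same incoming light and cosine factor
    return tuple(
        (a / math.pi + specular) * c * n_dot_l for a, c in zip(albedo, light_rgb)
    )
```

Because the material inputs (`albedo`, `normal`, `roughness`) are separate from the lighting inputs (`light_dir`, `light_rgb`), relighting in this sketch amounts to calling `shade` again with different lighting while leaving the material untouched.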
📝 Abstract
Reconstructing animatable and high-quality 3D head avatars from monocular videos, especially with realistic relighting, is a valuable task. However, the limited information from single-view input, combined with the complex head poses and facial movements, makes this challenging. Previous methods achieve real-time performance by combining 3D Gaussian Splatting with a parametric head model, but the resulting head quality suffers from inaccurate face tracking and limited expressiveness of the deformation model. These methods also fail to produce realistic effects under novel lighting conditions. To address these issues, we propose HRAvatar, a 3DGS-based method that reconstructs high-fidelity, relightable 3D head avatars. HRAvatar reduces tracking errors through end-to-end optimization and better captures individual facial deformations using learnable blendshapes and learnable linear blend skinning. Additionally, it decomposes head appearance into several physical properties and incorporates physically-based shading to account for environmental lighting. Extensive experiments demonstrate that HRAvatar not only reconstructs superior-quality heads but also achieves realistic visual effects under varying lighting conditions.
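The deformation model mentioned above, blendshapes followed by linear blend skinning, can be sketched for a single point as follows. This is a minimal illustration, not HRAvatar's code: the function name `deform_point` and its argument layout are hypothetical, and the blendshape offsets and skinning weights, which the abstract describes as learnable, are plain numbers here.

```python
def deform_point(rest, offsets, expression, skin_weights, joint_transforms):
    """Deform one 3D point with blendshapes, then linear blend skinning.

    rest:             (x, y, z) position in the neutral pose
    offsets:          one (dx, dy, dz) displacement per expression basis
    expression:       one coefficient per expression basis
    skin_weights:     one weight per joint, expected to sum to 1
    joint_transforms: one (rotation, translation) pair per joint, where
                      rotation is a 3x3 nested sequence and translation
                      is a 3-vector
    """
    # Step 1: blendshapes add a weighted sum of per-basis displacements.
    shaped = list(rest)
    for coeff, offset in zip(expression, offsets):
        for axis in range(3):
            shaped[axis] += coeff * offset[axis]

    # Step 2: LBS blends the rigidly transformed copies of the shaped point.
    posed = [0.0, 0.0, 0.0]
    for weight, (rotation, translation) in zip(skin_weights, joint_transforms):
        for axis in range(3):
            moved = (
                sum(rotation[axis][c] * shaped[c] for c in range(3))
                + translation[axis]
            )
            posed[axis] += weight * moved
    return tuple(posed)


IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

# One expression basis at half strength, two joints with equal influence.
point = deform_point(
    rest=(1.0, 0.0, 0.0),
    offsets=[(0.0, 1.0, 0.0)],
    expression=[0.5],
    skin_weights=[0.5, 0.5],
    joint_transforms=[(IDENTITY, (0.0, 0.0, 0.0)), (IDENTITY, (2.0, 0.0, 0.0))],
)
print(point)  # (2.0, 0.5, 0.0)
```

In an optimization setting the offsets and skinning weights would be trainable tensors updated by gradient descent alongside the Gaussians; the fixed values above only show the forward computation.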