Relightable and Dynamic Gaussian Avatar Reconstruction from Monocular Video

📅 2025-12-10
🤖 AI Summary
Problem: Reconstructing photorealistic, relightable, and animatable human avatars from monocular video suffers from rendering artifacts because motion-coupled geometric details, especially clothing wrinkles, are missing under pose and motion changes. Method: This paper proposes RnD-Avatar, a dynamic human modeling framework built on 3D Gaussian Splatting (3DGS). It models pose-dependent articulation and motion-induced deformation jointly through learnable dynamic skinning weights, and introduces a regularization that recovers fine geometric details from sparse visual cues. Contribution/Results: The authors present a new multi-view dataset with varied lighting conditions for evaluating relighting. The method achieves state-of-the-art performance in novel view synthesis, novel pose rendering, and relighting, and supports photorealistic rendering of dynamic avatars under arbitrary illumination.

📝 Abstract
Modeling relightable and animatable human avatars from monocular video is a long-standing and challenging task. Recently, Neural Radiance Field (NeRF) and 3D Gaussian Splatting (3DGS) methods have been employed to reconstruct such avatars. However, their results often fall short of photo-realism because they lack geometric details tied to body motion, such as clothing wrinkles. In this paper, we propose a 3DGS-based human avatar modeling framework, termed Relightable and Dynamic Gaussian Avatar (RnD-Avatar), that models accurate pose-variant deformation for high-fidelity geometric details. To achieve this, we introduce dynamic skinning weights that define the human avatar's articulation based on pose while also learning additional deformations induced by body motion. We also introduce a novel regularization to capture fine geometric details under sparse visual cues. Furthermore, we present a new multi-view dataset with varied lighting conditions to evaluate relighting. Our framework enables realistic rendering of novel poses and views while supporting photo-realistic lighting effects under arbitrary lighting conditions. Our method achieves state-of-the-art performance in novel view synthesis, novel pose rendering, and relighting.

Problem

Research questions and friction points this paper is trying to address.

Reconstructs relightable, animatable human avatars from monocular video
Improves geometric details like clothing wrinkles for realistic motion
Enables novel pose and view rendering under arbitrary lighting conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic skinning weights for pose-variant deformation
Regularization capturing fine geometric details
Multi-view dataset for relighting evaluation
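
To make the idea of dynamic skinning weights more concrete, the following is a minimal sketch of linear blend skinning in which the per-point weights are adjusted by an additive offset before normalization. It is not the paper's implementation: the function names, the softmax parameterization, and the `logit_offsets` input are all assumptions made for illustration. In a real system the offsets would presumably be predicted by a learned network from pose or motion features; here they are passed in directly, and the points stand in for Gaussian centers.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def dynamic_lbs(points, base_logits, logit_offsets, joint_transforms):
    """Linear blend skinning with offset-adjusted weights (illustrative only).

    points:           (N, 3) canonical positions, e.g. Gaussian centers
    base_logits:      (N, J) static skinning logits per point and joint
    logit_offsets:    (N, J) hypothetical pose/motion-dependent corrections
    joint_transforms: (J, 4, 4) rigid transform of each joint for this pose
    Returns posed points (N, 3) and the normalized weights (N, J).
    """
    # Offsets shift the logits; softmax keeps weights positive and summing to 1.
    weights = softmax(base_logits + logit_offsets, axis=-1)

    # Blend the joint transforms per point: (N, J) x (J, 4, 4) -> (N, 4, 4).
    blended = np.einsum("nj,jab->nab", weights, joint_transforms)

    # Apply each blended transform to its point in homogeneous coordinates.
    ones = np.ones((points.shape[0], 1))
    homo = np.concatenate([points, ones], axis=1)
    posed = np.einsum("nab,nb->na", blended, homo)
    return posed[:, :3], weights
```

With zero offsets this reduces to ordinary linear blend skinning with fixed weights. Non-zero offsets move a point's influence between joints for a given frame, which is one plausible way to let the articulation itself vary with pose and motion rather than adding a separate displacement on top of fixed weights.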