HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

📅 2024-08-12
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
Existing few-shot 3D head modeling approaches suffer from limited photorealism, poor dynamic expressiveness, and insufficient personalization. This paper proposes a two-phase framework: first, learning generalizable 3D head priors from a large-scale multi-view dynamic dataset; second, applying these priors to personalize an avatar from only a few in-the-wild images through inversion and fine-tuning. Key contributions include: (i) a generalizable 3D Gaussian prior for few-shot head avatar creation; and (ii) a Gaussian Splatting-based auto-decoder with part-based dynamic modeling that combines identity-shared encoding with personalized latent codes for each identity. Experiments demonstrate photo-realistic rendering quality, multi-view consistency, and stable animation.

📝 Abstract
In this paper, we present a novel 3D head avatar creation approach capable of generalizing from few-shot in-the-wild data with high-fidelity and animatable robustness. Given the underconstrained nature of this problem, incorporating prior knowledge is essential. Therefore, we propose a framework comprising prior learning and avatar creation phases. The prior learning phase leverages 3D head priors derived from a large-scale multi-view dynamic dataset, and the avatar creation phase applies these priors for few-shot personalization. Our approach effectively captures these priors by utilizing a Gaussian Splatting-based auto-decoder network with part-based dynamic modeling. Our method employs identity-shared encoding with personalized latent codes for individual identities to learn the attributes of Gaussian primitives. During the avatar creation phase, we achieve fast head avatar personalization by leveraging inversion and fine-tuning strategies. Extensive experiments demonstrate that our model effectively exploits head priors and successfully generalizes them to few-shot personalization, achieving photo-realistic rendering quality, multi-view consistency, and stable animation.
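The "inversion, then fine-tuning" personalization described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: a fixed linear map stands in for the learned Gaussian Splatting-based auto-decoder, a short vector stands in for the few-shot observations, and every name and value here (`PRIOR_WEIGHTS`, `TARGET`, `invert`, `fine_tune`, the learning rate and step counts) is a hypothetical choice made for illustration. It only shows the general pattern: first fit a personalized latent code with the shared decoder frozen, then update the decoder together with the code.

```python
# Toy illustration of auto-decoder personalization (assumed setup, not HeadGAP itself).

# Hypothetical "prior": a shared linear decoder, imagined as learned from many identities.
PRIOR_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-0.5, 0.5]]
# Hypothetical observations of a new identity; chosen so the frozen prior cannot fit them exactly.
TARGET = [0.9, -0.4, 0.6, -0.2]


def decode(weights, code):
    """Shared decoder: maps a latent code to a vector of attributes."""
    return [sum(w * z for w, z in zip(row, code)) for row in weights]


def mse(weights, code, target):
    """Mean squared error between decoded attributes and the target."""
    pred = decode(weights, code)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def invert(weights, target, dim, steps=200, lr=0.1):
    """Inversion: gradient descent on the latent code only; the decoder stays frozen."""
    code = [0.0] * dim
    n = len(target)
    for _ in range(steps):
        resid = [p - t for p, t in zip(decode(weights, code), target)]
        grad = [2.0 / n * sum(r * row[j] for r, row in zip(resid, weights))
                for j in range(dim)]
        code = [z - lr * g for z, g in zip(code, grad)]
    return code


def fine_tune(weights, code, target, steps=200, lr=0.1):
    """Fine-tuning: gradient descent on decoder weights and latent code together."""
    weights = [row[:] for row in weights]  # copy so the shared prior is left untouched
    code = code[:]
    n = len(target)
    dim = len(code)
    for _ in range(steps):
        resid = [p - t for p, t in zip(decode(weights, code), target)]
        # Compute the code gradient before the weights are modified in this step.
        grad_code = [2.0 / n * sum(r * row[j] for r, row in zip(resid, weights))
                     for j in range(dim)]
        for i, r in enumerate(resid):
            for j in range(dim):
                weights[i][j] -= lr * 2.0 / n * r * code[j]
        code = [z - lr * g for z, g in zip(code, grad_code)]
    return weights, code


loss_init = mse(PRIOR_WEIGHTS, [0.0, 0.0], TARGET)
inverted_code = invert(PRIOR_WEIGHTS, TARGET, dim=2)
loss_inv = mse(PRIOR_WEIGHTS, inverted_code, TARGET)
tuned_weights, tuned_code = fine_tune(PRIOR_WEIGHTS, inverted_code, TARGET)
loss_ft = mse(tuned_weights, tuned_code, TARGET)
print(loss_init, loss_inv, loss_ft)
```

In this toy setup, inversion alone leaves a residual because the target lies outside what the frozen decoder can express, and fine-tuning reduces it further. That mirrors, in a very simplified way, why a personalization pipeline might combine both steps.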
Problem

Research questions and friction points this paper is trying to address.

3D head modeling
Limited sample size
Personalization and realism
Innovation

Methods, ideas, or system contributions that make the work stand out.

HeadGAP
High-fidelity 3D head modeling
Dynamic representation learning
Authors

Xiaozheng Zheng
ByteDance Seed
Computer Vision, Pose Estimation, Neural Rendering, Video Generation
Chao Wen
ByteDance
Computer Vision
Zhaohu Li
ByteDance
Weiyi Zhang
ByteDance
Zhuo Su
ByteDance
Xu Chang
ByteDance
Yang Zhao
ByteDance
Zheng Lv
ByteDance
Xiaoyuan Zhang
Peking University
Multi-Agent Learning, Reinforcement Learning
Yongjie Zhang
ByteDance
Guidong Wang
ByteDance
Lan Xu
ShanghaiTech University