Gaussian Eigen Models for Human Heads

📅 2024-07-05
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
Existing personalized avatars struggle to balance photorealism with computational efficiency. This paper introduces GEM, a lightweight, high-fidelity, and controllable digital avatar. GEM adapts 3D Morphable Model (3DMM) principles to the 3D Gaussian space: it distills a CNN-based neural renderer via PCA into linear eigenbases for expression-dependent position, scale, rotation, and opacity, enabling wrinkle-level detail from a low-dimensional parameter vector and driving from a single image. The method combines 3D Gaussian appearance modeling, Gaussian splatting for rendering, and linear eigen-space control. In self- and cross-subject reenactment benchmarks, GEM surpasses state-of-the-art methods in visual quality and generalization to unseen expressions, while reducing model size by over 90% and enabling real-time inference on consumer-grade hardware.

๐Ÿ“ Abstract
Current personalized neural head avatars face a trade-off: lightweight models lack detail and realism, while high-quality, animatable avatars require significant computational resources, making them unsuitable for commodity devices. To address this gap, we introduce Gaussian Eigen Models (GEM), which provide high-quality, lightweight, and easily controllable head avatars. GEM utilizes 3D Gaussian primitives for representing the appearance combined with Gaussian splatting for rendering. Building on the success of mesh-based 3D morphable face models (3DMM), we define GEM as an ensemble of linear eigenbases for representing the head appearance of a specific subject. In particular, we construct linear bases to represent the position, scale, rotation, and opacity of the 3D Gaussians. This allows us to efficiently generate Gaussian primitives of a specific head shape by a linear combination of the basis vectors, only requiring a low-dimensional parameter vector that contains the respective coefficients. We propose to construct these linear bases (GEM) by distilling high-quality compute-intense CNN-based Gaussian avatar models that can generate expression-dependent appearance changes like wrinkles. These high-quality models are trained on multi-view videos of a subject and are distilled using a series of principal component analyses. Once we have obtained the bases that represent the animatable appearance space of a specific human, we learn a regressor that takes a single RGB image as input and predicts the low-dimensional parameter vector that corresponds to the shown facial expression. In a series of experiments, we compare GEM's self-reenactment and cross-person reenactment results to state-of-the-art 3D avatar methods, demonstrating GEM's higher visual quality and better generalization to new expressions.
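The core mechanism described above, distilling per-frame Gaussian parameters into linear eigenbases via PCA and reconstructing them from a low-dimensional coefficient vector, can be sketched as follows. This is a minimal illustration with NumPy, not the paper's implementation: the frame count, Gaussian count, basis size, and the random stand-in for the CNN teacher's outputs are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_gaussians = 200, 1000
# Each Gaussian has position (3), scale (3), rotation quaternion (4), opacity (1).
param_dim = n_gaussians * (3 + 3 + 4 + 1)

# Stand-in for per-frame Gaussian parameters produced by the high-quality
# CNN-based teacher model (one flattened vector per training frame).
P = rng.standard_normal((n_frames, param_dim))

# Distill a linear eigenbasis via PCA: mean plus top-k principal components.
mean = P.mean(axis=0)
U, S, Vt = np.linalg.svd(P - mean, full_matrices=False)
k = 16                   # assumed size of the low-dimensional expression space
basis = Vt[:k]           # (k, param_dim) eigenbasis

# A low-dimensional coefficient vector now suffices to generate the full
# set of Gaussian primitives as a linear combination of the basis vectors.
coeffs = (P[0] - mean) @ basis.T   # project frame 0 into the eigenspace
recon = mean + coeffs @ basis      # reconstruct all Gaussian parameters
print(recon.shape)                 # → (11000,)
```

In the paper's pipeline, the coefficient vector would come from a regressor that maps a single RGB image of a facial expression to `coeffs`, rather than from projecting known training frames.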
Problem

Research questions and friction points this paper is trying to address.

Realism
Computational Efficiency
Personalized Virtual Avatars
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Eigen Model
3D avatar personalization
efficient rendering