Gaussian Pixel Codec Avatars: A Hybrid Representation for Efficient Rendering

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of real-time, high-fidelity head avatar rendering on mobile devices. We propose a hybrid translucent representation combining a triangle mesh and anisotropic 3D Gaussians: the mesh models surface regions (e.g., skin), while the 3D Gaussians capture complex non-surface details (e.g., hair). We integrate the mesh into a differentiable Gaussian splatting framework, establishing a unified differentiable rendering pipeline. Coupled with a neural decoding network, multi-view image supervision, and RGBA texture synthesis, our method achieves high-quality translucent rendering. Quantitatively and qualitatively, its visual fidelity matches that of purely Gaussian-based approaches, while rendering speed reaches the level of conventional mesh rendering and GPU memory consumption is significantly reduced. To our knowledge, this is the first method to enable real-time, high-fidelity head rendering on mobile platforms.

📝 Abstract
We present Gaussian Pixel Codec Avatars (GPiCA), photorealistic head avatars that can be generated from multi-view images and efficiently rendered on mobile devices. GPiCA utilizes a unique hybrid representation that combines a triangle mesh and anisotropic 3D Gaussians. This combination maximizes memory and rendering efficiency while maintaining a photorealistic appearance. The triangle mesh is highly efficient in representing surface areas like facial skin, while the 3D Gaussians effectively handle non-surface areas such as hair and beard. To this end, we develop a unified differentiable rendering pipeline that treats the mesh as a semi-transparent layer within the volumetric rendering paradigm of 3D Gaussian Splatting. We train neural networks to decode a facial expression code into three components: a 3D face mesh, an RGBA texture, and a set of 3D Gaussians. These components are rendered simultaneously in a unified rendering engine. The networks are trained using multi-view image supervision. Our results demonstrate that GPiCA achieves the realism of purely Gaussian-based avatars while matching the rendering performance of mesh-based avatars.
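The key idea in the abstract is a unified renderer in which a mesh fragment is treated as just another semi-transparent contribution inside the front-to-back alpha compositing used by 3D Gaussian Splatting. A minimal sketch of that compositing step (not the paper's implementation; the `Fragment` type and values are illustrative assumptions):

```python
# Sketch, assuming per-pixel contributions have already been projected:
# each contribution (a splatted Gaussian or a mesh fragment sampled from
# the RGBA texture) reduces to a depth, an RGB color, and an opacity.
from dataclasses import dataclass

@dataclass
class Fragment:
    depth: float              # distance from the camera
    rgb: tuple                # (r, g, b), each in [0, 1]
    alpha: float              # opacity in [0, 1]

def composite(fragments):
    """Front-to-back 'over' compositing, as in 3D Gaussian Splatting.

    Returns the accumulated pixel color and the remaining transmittance.
    """
    out = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for f in sorted(fragments, key=lambda f: f.depth):
        weight = transmittance * f.alpha
        for c in range(3):
            out[c] += weight * f.rgb[c]
        transmittance *= (1.0 - f.alpha)
        if transmittance < 1e-4:  # early termination once nearly opaque
            break
    return out, transmittance

# A mesh fragment enters the same depth-sorted list as the Gaussians,
# which is what makes the pipeline "unified" (example values only).
gaussians = [Fragment(2.0, (0.8, 0.7, 0.6), 0.3),
             Fragment(3.5, (0.2, 0.2, 0.2), 0.9)]
mesh_frag = Fragment(2.5, (0.9, 0.8, 0.7), 0.95)
color, T = composite(gaussians + [mesh_frag])
```

Because every operation here is differentiable in the colors and opacities, the same scheme supports the gradient flow needed to train the decoder networks from multi-view images.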
Problem

Research questions and friction points this paper is trying to address.

Creating photorealistic head avatars from multi-view images.
Rendering such avatars efficiently on mobile devices, where purely Gaussian-based approaches are too costly.
Representing surface regions (skin) and non-surface regions (hair, beard) within a single model.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid representation combining mesh and 3D Gaussians
Unified differentiable rendering pipeline for simultaneous rendering
Neural networks decode expression into mesh, texture, and Gaussians