AI Summary
Generating high-fidelity, animatable 3D facial avatars remains challenging: implicit representations (e.g., NeRF) render slowly and are dynamically inconsistent, while 3D Gaussian Splatting (3DGS) methods lack dynamic controllability. This work proposes the first real-time drivable 3D Gaussian facial avatar method. It introduces a lightweight FLAME-conditioned deformation branch that predicts residual displacements of the Gaussian points for fine-grained expression modeling, uses a dual-discriminator GAN framework to improve motion realism, and trains with supervision from synthetic renderings of the parametric mesh. The method runs at more than 250 FPS on a single GPU and at ~9 FPS under CPU-only inference, which the authors describe as, to their knowledge, the first practical CPU-only demonstration for animatable 3DGS avatars. It also outperforms NeRF-based baselines on expression accuracy while preserving identity.
Abstract
The generation of high-fidelity, animatable 3D human avatars remains a core challenge in computer graphics and vision, with applications in VR, telepresence, and entertainment. Existing approaches based on implicit representations like NeRFs suffer from slow rendering and dynamic inconsistencies, while 3D Gaussian Splatting (3DGS) methods are typically limited to static head generation, lacking dynamic control. We bridge this gap by introducing AGORA, a novel framework that extends 3DGS within a generative adversarial network to produce animatable avatars. Our key contribution is a lightweight, FLAME-conditioned deformation branch that predicts per-Gaussian residuals, enabling identity-preserving, fine-grained expression control while allowing real-time inference. Expression fidelity is enforced via a dual-discriminator training scheme leveraging synthetic renderings of the parametric mesh. AGORA generates avatars that are not only visually realistic but also precisely controllable. Quantitatively, we outperform state-of-the-art NeRF-based methods on expression accuracy while rendering at 250+ FPS on a single GPU, and, notably, at $\sim$9 FPS under CPU-only inference, representing, to our knowledge, the first demonstration of practical CPU-only animatable 3DGS avatar synthesis. This work represents a significant step toward practical, high-performance digital humans. Project website: https://ramazan793.github.io/AGORA/
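To make the idea of a FLAME-conditioned deformation branch predicting per-Gaussian residuals more concrete, the following is a minimal NumPy sketch and not the AGORA implementation. Everything in it is an assumption made for illustration: the function names `init_deformation_branch` and `deform_gaussians`, the two-layer MLP, the feature and expression dimensions, and the zero-initialized output layer (a common choice so that an untrained branch leaves the canonical Gaussians unchanged). Only the residual formulation, deformed position = canonical position + predicted offset, is taken from the abstract.

```python
import numpy as np


def init_deformation_branch(feat_dim, expr_dim, hidden_dim=64, seed=0):
    """Create weights for a small two-layer MLP (hypothetical architecture)."""
    rng = np.random.default_rng(seed)
    in_dim = feat_dim + expr_dim
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        # Assumption: zero-initialized output layer, so the initial residual is 0
        # and the deformed Gaussians coincide with the canonical ones.
        "W2": np.zeros((hidden_dim, 3)),
        "b2": np.zeros(3),
    }


def deform_gaussians(params, canonical_xyz, gaussian_feats, flame_expr):
    """Predict a 3D residual offset per Gaussian, conditioned on FLAME parameters.

    canonical_xyz:  (N, 3) Gaussian centers in the neutral (canonical) pose
    gaussian_feats: (N, F) per-Gaussian features (hypothetical)
    flame_expr:     (E,)   FLAME expression vector shared by all Gaussians
    """
    n = canonical_xyz.shape[0]
    # Broadcast the single expression vector to every Gaussian.
    expr = np.broadcast_to(flame_expr, (n, flame_expr.shape[0]))
    x = np.concatenate([gaussian_feats, expr], axis=1)
    hidden = np.maximum(x @ params["W1"] + params["b1"], 0.0)  # ReLU
    residual = hidden @ params["W2"] + params["b2"]
    # Residual formulation: only the offset depends on the expression.
    return canonical_xyz + residual, residual
```

Because the canonical centers are passed through unchanged and only the offset depends on the expression vector, a sketch of this kind keeps identity and expression control separate, which is the property the abstract attributes to the residual design.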