🤖 AI Summary
This paper addresses the challenge of generating controllable 3D Gaussian head avatars from a single portrait image with real-time rendering capability. We propose a two-stage framework that decouples reconstruction from reenactment. First, leveraging WebSSL, we construct a large-scale one-shot Gaussian head generator and train it in two stages, achieving strong generalization in geometry reconstruction and faithful preservation of high-frequency texture details. Second, we design an ultra-lightweight driving module that explicitly disentangles geometric and appearance modeling from dynamic control, enabling fine-grained control over pose, expression, and illumination while ensuring that parameter growth during reconstruction does not compromise driving efficiency. The framework follows a scaling law: enlarging the reconstruction module improves quality without affecting driving speed. At 512×512 resolution, our method achieves real-time rendering at 90 FPS. Quantitative and qualitative evaluations demonstrate consistent superiority over state-of-the-art approaches, establishing an efficient and scalable paradigm for high-fidelity 3D head avatar generation from a single input image.
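To make the decoupling concrete, the sketch below shows how such a two-stage pipeline typically runs at inference time. It is a minimal illustration under our own assumptions, not the paper's implementation: the class names (`GaussianHeadGenerator`, `LightweightDriver`), layer choices, toy Gaussian count `N`, 14-value Gaussian parameterization, and control-signal dimensions are all hypothetical.

```python
import torch
import torch.nn as nn

# Minimal sketch of the reconstruction/reenactment split described above.
# Every name, layer, and dimension here is an illustrative assumption; the
# paper's actual architecture is not specified in this summary.

N = 4096                      # toy Gaussian count (a real model uses far more)
POSE, EXPR, LIGHT = 6, 50, 8  # assumed control-signal dimensions

class GaussianHeadGenerator(nn.Module):
    """Heavy reconstruction module (WebSSL-based in the paper).
    Runs ONCE per identity, so its parameter count can grow
    (scaling law) without touching the per-frame driving cost."""
    def __init__(self, feat: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=4, padding=3), nn.GELU(),
            nn.Conv2d(64, feat, 3, stride=4, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # 3 pos + 3 scale + 4 rot + 1 opacity + 3 color = 14 values/Gaussian
        self.head = nn.Linear(feat, N * 14)

    def forward(self, portrait: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(portrait)).view(-1, N, 14)

class LightweightDriver(nn.Module):
    """Ultra-light driving module: runs EVERY frame, predicting small
    offsets to the fixed Gaussians from the control signals alone."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(POSE + EXPR + LIGHT, 128), nn.GELU(),
            nn.Linear(128, 14),
        )

    def forward(self, gaussians, pose, expression, illumination):
        ctrl = torch.cat([pose, expression, illumination], dim=-1)
        return gaussians + self.mlp(ctrl).unsqueeze(1)  # broadcast over N

@torch.no_grad()
def reenact(portrait, control_stream, generator, driver):
    gaussians = generator(portrait)           # heavy, amortized once
    for pose, expr, light in control_stream:  # light, per frame
        yield driver(gaussians, pose, expr, light)
        # A real pipeline would splat/rasterize the driven Gaussians here.
```

Because `reenact` touches the generator only once, swapping in a larger backbone changes reconstruction time but leaves the per-frame loop untouched, which is the essence of the separation design.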
📝 Abstract
In this paper, we explore a framework for 3D Gaussian heads that separates reconstruction from reenactment and requires only a single portrait image as input to generate a controllable avatar. Specifically, we develop a large-scale one-shot Gaussian head generator built upon WebSSL and employ a two-stage training approach that significantly enhances generalization and high-frequency texture reconstruction. During inference, an ultra-lightweight Gaussian avatar driven by control signals enables high-frame-rate rendering, achieving 90 FPS at a resolution of 512×512. We further demonstrate that the proposed framework follows a scaling law: increasing the parameter scale of the reconstruction module leads to improved performance while, thanks to the separated design, driving efficiency remains unaffected. Finally, extensive quantitative and qualitative experiments validate that our approach outperforms current state-of-the-art methods.
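A back-of-the-envelope reading of the reported numbers makes the efficiency claim tangible; only the 90 FPS and 512×512 figures come from the abstract, and the cost split is our illustrative assumption.

```python
# Per-frame time budget implied by the reported 90 FPS at 512x512.
frame_budget_ms = 1000 / 90          # ~11.1 ms for driving + rendering

# Under the separated design, only the lightweight driver must fit in
# this budget; the reconstruction module, however large, runs once per
# identity and is amortized over the whole output video.
frames_per_minute = 90 * 60
print(f"{frame_budget_ms:.1f} ms/frame; reconstruction amortized over "
      f"{frames_per_minute} frames per minute of video")
```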