SplitAvatar: One-shot Head Avatar with Autoregressive Gaussian Splitting

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing 3D Gaussian splatting-based methods for animatable head avatar reconstruction struggle to preserve fine-grained facial expression details, because image-based approaches and 3D morphable model (3DMM)-based approaches generate very different numbers of Gaussians. To address this, the work proposes an autoregressive graph splitting network that builds a structurally consistent, density-adaptive animatable head avatar from a single image. The representation is refined from coarse to fine through three components: mesh topology extension, which keeps the graph connectivity aligned with the growing Gaussian count; autoregressive Gaussian splitting; and a soft-mask gating mechanism that controls Gaussian density per facial region. Combined with a graph neural network (GNN) and a delayed filtering training strategy, the method is reported to synthesize more precise facial details and to achieve higher reconstruction quality.
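The idea of growing the graph connectivity alongside the Gaussian count can be illustrated with a small sketch. This is not the paper's algorithm: the function name `split_graph`, the two-children-per-parent rule, the fixed `offset`, and the edge-inheritance rule are all assumptions chosen only to show how an edge list can stay consistent when every node is split.

```python
from itertools import product


def split_graph(positions, edges, offset=0.01):
    """Split every node into two children and extend the edge list.

    positions: list of (x, y, z) tuples, one per Gaussian centre.
    edges: set of (i, j) pairs with i < j (undirected connectivity).
    The children of parent p get indices 2*p and 2*p + 1 (an assumed scheme).
    """
    new_positions = []
    for x, y, z in positions:
        # Hypothetical placement: nudge the two children apart along x.
        new_positions.append((x - offset, y, z))
        new_positions.append((x + offset, y, z))

    new_edges = set()
    for p in range(len(positions)):
        # Siblings stay connected to each other.
        new_edges.add((2 * p, 2 * p + 1))
    for i, j in edges:
        # Children inherit every edge their parents shared.
        for a, b in product((2 * i, 2 * i + 1), (2 * j, 2 * j + 1)):
            new_edges.add((min(a, b), max(a, b)))
    return new_positions, new_edges


# Coarse-to-fine usage: apply the split repeatedly, level by level.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
edges = {(0, 1), (0, 2), (1, 2)}
for _ in range(2):
    positions, edges = split_graph(positions, edges)
```

Because child indices are derived directly from parent indices, the connectivity for each finer level can be computed without searching for neighbours again, which is the property the sketch is meant to convey.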

📝 Abstract

3D Gaussian Splatting (3DGS) provides an efficient method for high-quality scene reconstruction using anisotropic Gaussians. Recently, 3DGS-based methods have significantly improved the rendering quality of human avatars while enabling real-time performance. However, existing methods suffer from a mismatch in magnitude between the number of Gaussians generated by image-based approaches and by 3D morphable model (3DMM)-based approaches. This discrepancy results in reconstructed expressions that lack fine-grained detail. In this paper, we introduce a novel method for reconstructing an animatable head avatar from a single image. We propose a graph splitting network that progressively generates Gaussians from coarse to fine using an autoregressive architecture. To address the graph inconsistency caused by split Gaussians, we employ a mesh topology extension method to align the connectivity of the graph neural network (GNN) with the increased Gaussian count. Furthermore, we introduce a novel density control method whose gating mechanism generates soft masks for Gaussians, preventing over-densification after the splitting operation. This allows for dynamic control over Gaussian density across different facial regions. For smooth and rapid training, we employ a delayed filtering strategy that avoids re-computing the graph topology during training. Experimental results demonstrate that our autoregressive structure effectively improves expression representation ability by progressively splitting Gaussians. This process, enabled by the GNN-guided splitting, synthesizes more precise facial details and achieves higher reconstruction quality.
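The soft-mask gating and delayed filtering described in the abstract might look roughly like the following sketch. It rests on an assumed reading of the abstract: during training every Gaussian stays in the graph and its opacity is only scaled by a sigmoid mask, and low-mask Gaussians are actually removed later. The function names, the sigmoid gate, and the `threshold` value are hypothetical and do not come from the paper.

```python
import math


def soft_mask(logit):
    """Numerically stable sigmoid mapping a gate logit to a mask in (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def gate_opacities(opacities, gate_logits):
    """Training-time step: scale opacities by the soft mask.

    No Gaussian is removed, so node indices and edges stay valid.
    """
    return [o * soft_mask(g) for o, g in zip(opacities, gate_logits)]


def filter_gaussians(gaussians, gate_logits, threshold=0.5):
    """Delayed step: drop Gaussians whose mask falls below the threshold."""
    return [
        g for g, logit in zip(gaussians, gate_logits)
        if soft_mask(logit) >= threshold
    ]


# Usage: gate during training, prune once afterwards.
opacities = [1.0, 0.8, 0.6]
gate_logits = [4.0, 0.0, -4.0]
gated = gate_opacities(opacities, gate_logits)
kept = filter_gaussians(["g0", "g1", "g2"], gate_logits)
```

Keeping masked Gaussians in place while training means the graph built after each split never has to be rebuilt mid-optimization; the hard removal is a single pass at the end.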

Problem

Research questions and friction points this paper is trying to address.

3D Gaussian Splatting

head avatar

expression detail

Gaussian density

one-shot reconstruction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoregressive Gaussian Splitting

3D Gaussian Splatting

Graph Neural Network