PanoLAM: Large Avatar Model for Gaussian Full-Head Synthesis from One-shot Unposed Image

📅 2025-09-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses high-fidelity Gaussian splatting reconstruction of the full head from a single image without pose constraints. The proposed method is an efficient, forward-only synthesis approach that eliminates the need for GAN inversion or test-time optimization. It employs a coarse-to-fine dual-branch architecture integrating spherical triplanes and point features, incorporating a Transformer-based feature interaction module, FLAME-guided sparse landmarks, point cloud densification, and triplane feature extraction, all regularized by a pre-trained 3D GAN prior. Crucially, the entire model is trained exclusively on synthetic data, removing reliance on real-world 3D head scans. Experiments demonstrate that the method achieves reconstruction fidelity competitive with optimization-based approaches in a single forward pass, significantly improving both speed and quality, and establishing a new paradigm for real-time Gaussian head avatar generation.
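The Transformer-based feature interaction module described above lets each sparse FLAME landmark query the image features. A minimal single-head cross-attention sketch in numpy is shown below; all weights, shapes, and names are illustrative assumptions, not the paper's implementation (a real model would use learned, multi-head attention):

```python
import numpy as np

def cross_attention(point_feat, image_feat, d_k=64, seed=0):
    """Single-head cross-attention: each point query attends to all image tokens."""
    rng = np.random.default_rng(seed)
    d = point_feat.shape[-1]
    # Hypothetical random projections standing in for learned weight matrices.
    Wq = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Q = point_feat @ Wq                      # (N_points, d_k)
    K = image_feat @ Wk                      # (N_tokens, d_k)
    V = image_feat @ Wv                      # (N_tokens, d_k)
    scores = Q @ K.T / np.sqrt(d_k)          # (N_points, N_tokens)
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over image tokens
    return weights @ V                       # per-point aggregated image feature

# e.g. 100 sparse FLAME landmark queries, 256 image patch tokens, 128-dim features
points = np.random.default_rng(1).standard_normal((100, 128))
tokens = np.random.default_rng(2).standard_normal((256, 128))
out = cross_attention(points, tokens)
print(out.shape)  # (100, 64)
```

Each landmark thereby gathers a pose-aware image descriptor that drives the coarse shape reconstruction before densification.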

๐Ÿ“ Abstract
We present a feed-forward framework for Gaussian full-head synthesis from a single unposed image. Unlike previous work that relies on time-consuming GAN inversion and test-time optimization, our framework reconstructs the Gaussian full-head model from a single unposed image in a single forward pass, enabling fast reconstruction and rendering during inference. To mitigate the lack of large-scale 3D head assets, we construct a large-scale synthetic dataset generated by pretrained 3D GANs and train our framework using only synthetic data. For efficient high-fidelity generation, we introduce a coarse-to-fine Gaussian head generation pipeline: sparse points from the FLAME model interact with image features via transformer blocks for feature extraction and coarse shape reconstruction, and the resulting points are then densified for high-fidelity reconstruction. To fully leverage the prior knowledge residing in pretrained 3D GANs, we propose a dual-branch framework that aggregates the structured spherical triplane features and the unstructured point-based features for more effective Gaussian head reconstruction. Experimental results show the effectiveness of our framework compared with existing work.
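The dual-branch aggregation combines structured triplane features, sampled at each 3D point, with that point's own unstructured features. The sketch below illustrates the idea using standard axis-aligned triplanes and bilinear sampling; the paper uses spherical triplanes, and all shapes and the concatenation-based fusion are simplifying assumptions:

```python
import numpy as np

def sample_plane(plane, u, v):
    """Bilinearly sample a (H, W, C) feature plane at normalized coords in [0, 1]."""
    H, W, _ = plane.shape
    x = np.clip(u * (W - 1), 0, W - 1)
    y = np.clip(v * (H - 1), 0, H - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = (x - x0)[:, None], (y - y0)[:, None]
    return ((1 - wx) * (1 - wy) * plane[y0, x0]
            + wx * (1 - wy) * plane[y0, x1]
            + (1 - wx) * wy * plane[y1, x0]
            + wx * wy * plane[y1, x1])

def triplane_features(planes, xyz):
    """Project each 3D point onto the XY/XZ/YZ planes, sample, and sum."""
    coords = (xyz + 1.0) / 2.0                    # map [-1, 1] -> [0, 1]
    xy = sample_plane(planes[0], coords[:, 0], coords[:, 1])
    xz = sample_plane(planes[1], coords[:, 0], coords[:, 2])
    yz = sample_plane(planes[2], coords[:, 1], coords[:, 2])
    return xy + xz + yz                           # (N, C) structured feature

# Illustrative sizes: 3 planes of 64x64x32, 1000 points, 16-dim point features
planes = np.random.default_rng(0).standard_normal((3, 64, 64, 32))
xyz = np.random.default_rng(1).uniform(-1, 1, size=(1000, 3))
point_feat = np.random.default_rng(2).standard_normal((1000, 16))
fused = np.concatenate([triplane_features(planes, xyz), point_feat], axis=-1)
print(fused.shape)  # (1000, 48)
```

The fused per-point feature would then be decoded into Gaussian attributes (position offset, scale, rotation, opacity, color).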
Problem

Research questions and friction points this paper is trying to address.

Synthesizing 3D Gaussian full-head models from single unposed images
Eliminating reliance on slow GAN inversion and optimization
Addressing lack of large-scale 3D head training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward framework for Gaussian full-head synthesis
Coarse-to-fine Gaussian generation pipeline with transformer
Dual-branch framework combining spherical and point features
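The densification step in the coarse-to-fine pipeline can be pictured as upsampling the coarse point set so that enough Gaussians exist for high-fidelity detail. The sketch below simply jitters offspring around each coarse point; this is a minimal illustration under assumed parameters, not the paper's actual densification scheme:

```python
import numpy as np

def densify(points, k=4, sigma=0.01, seed=0):
    """Upsample a sparse point cloud by spawning k jittered offspring per point."""
    rng = np.random.default_rng(seed)
    offsets = rng.normal(scale=sigma, size=(points.shape[0], k, 3))
    dense = points[:, None, :] + offsets          # (N, k, 3)
    return dense.reshape(-1, 3)                   # (N * k, 3)

# e.g. 100 coarse points reconstructed from sparse FLAME landmarks
sparse = np.random.default_rng(1).uniform(-1, 1, size=(100, 3))
dense = densify(sparse)
print(dense.shape)  # (400, 3)
```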