🤖 AI Summary
Existing 3D-aware face generators are constrained by frontal-biased 2D face data, limiting their ability to model complete head–neck–shoulder (¼-head) geometry. This stems from the difficulty of detecting large-angle or rear-facing faces and severe geometric ambiguity arising from pose diversity under single-view supervision. To address this, we propose the first single-view, 3D-consistent generation method for ¼-head portraits. We introduce 360°PHQ—the first high-quality, single-view portrait dataset covering full 360° azimuthal viewpoints—and design an implicit 3D GAN framework that jointly optimizes camera parameters, self-supervised body pose estimation, and multi-view consistency constraints. Experiments demonstrate that our method generates high-fidelity, geometrically complete, and pose-accurate ¼-head 3D portraits across all viewpoints. Both qualitative and quantitative evaluations show significant improvements over state-of-the-art methods.
📝 Abstract
3D-aware face generators are typically trained on 2D real-life face image datasets that consist primarily of near-frontal faces, and as such they are unable to construct one-quarter headshot 3D portraits with complete head, neck, and shoulder geometry. Two reasons account for this issue: first, existing facial recognition methods struggle to extract facial data from images captured at large camera angles or from the back; second, it is challenging to learn a distribution of 3D portraits covering the one-quarter headshot region from single-view data, owing to the significant geometric deformation caused by diverse body poses. To this end, we first create the dataset 360°-Portrait-HQ (360°PHQ for short), which consists of high-quality single-view real portraits annotated with a variety of camera parameters (yaw angles spanning the entire 360° range) and body poses. We then propose 3DPortraitGAN, the first 3D-aware one-quarter headshot portrait generator, which learns a canonical 3D avatar distribution from the 360°PHQ dataset with body pose self-learning. Our model can generate view-consistent portrait images from all camera angles with a canonical one-quarter headshot 3D representation. Our experiments show that the proposed framework can accurately predict portrait body poses and generate view-consistent, realistic portrait images with complete geometry from all camera angles.
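The pipeline the abstract describes (a canonical 3D avatar, deformed by a predicted body pose, then rendered from an arbitrary camera yaw) can be illustrated with a toy NumPy sketch. Every function below (`generate_canonical`, `apply_body_pose`, `render_silhouette`) is a hypothetical stand-in for exposition only, not the paper's actual implicit-GAN implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def yaw_camera(yaw):
    """Rotation matrix for a camera orbiting the full 360° azimuth."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def generate_canonical(z, n_points=64):
    """Toy 'generator': map a latent code to canonical head/neck/shoulder
    point geometry (stand-in for the paper's implicit 3D representation)."""
    basis = rng.standard_normal((z.size, n_points, 3)) * 0.05
    return np.tensordot(z, basis, axes=1)  # shape (n_points, 3)

def apply_body_pose(points, pose_yaw):
    """Deform canonical geometry by a predicted body pose (here a single
    yaw angle, standing in for the self-learned pose parameters)."""
    return points @ yaw_camera(pose_yaw).T

def render_silhouette(points, cam_yaw):
    """Orthographic 'render' from a given camera yaw: project to 2D."""
    return (points @ yaw_camera(cam_yaw).T)[:, :2]

# One toy forward pass: latent -> canonical avatar -> posed -> two views.
z = rng.standard_normal(8)
canonical = generate_canonical(z)
posed = apply_body_pose(canonical, pose_yaw=0.3)
img_front = render_silhouette(posed, cam_yaw=0.0)
img_back = render_silhouette(posed, cam_yaw=np.pi)  # rear view

# Both views come from the same underlying 3D geometry, which is the
# source of the method's cross-view consistency.
assert img_front.shape == img_back.shape == (64, 2)
```

The key design point this sketch mirrors is that pose deformation happens in 3D before rendering, so a single canonical avatar supports supervision from every azimuth in the 360°PHQ data.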