🤖 AI Summary
Addressing the challenge of generating high-fidelity, view-consistent 360° surround views from a single portrait image, this paper proposes a diffusion-based generative framework, built on DiffPortrait3D, that is explicitly designed for 360° geometric and appearance consistency. Methodologically, it introduces a custom ControlNet for synthesizing back-of-head detail, a dual appearance module that enforces global geometric and textural coherence between front and back views, and a training scheme that combines continuous-view sequences with a generated back-of-head reference image. The resulting multi-view outputs can be used to build high-quality neural radiance fields (NeRFs) for real-time, free-viewpoint rendering from a single input image. Extensive evaluation demonstrates substantial improvements over state-of-the-art methods on challenging portraits (e.g., with eyewear or headwear), yielding locally smooth, globally consistent geometry and texture, and enabling immersive telepresence and personalized content creation.
📝 Abstract
Generating high-quality 360-degree views of human heads from single-view images is essential for enabling accessible immersive telepresence applications and scalable personalized content creation. While cutting-edge methods for full head generation are limited to modeling realistic human heads, the latest diffusion-based approaches for style-omniscient head synthesis can produce only frontal views and struggle with view consistency, preventing their conversion into true 3D models for rendering from arbitrary angles. We introduce a novel approach that generates fully consistent 360-degree head views, accommodating human, stylized, and anthropomorphic forms, including accessories like glasses and hats. Our method builds on the DiffPortrait3D framework, incorporating a custom ControlNet for back-of-head detail generation and a dual appearance module to ensure global front-back consistency. By training on continuous view sequences and integrating a back reference image, our approach achieves robust, locally continuous view synthesis. Our model can be used to produce high-quality neural radiance fields (NeRFs) for real-time, free-viewpoint rendering, outperforming state-of-the-art methods in object synthesis and 360-degree head generation for very challenging input portraits.
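To make the front/back conditioning idea concrete, here is a minimal toy sketch, not the paper's actual architecture: the "appearance encoder", the yaw-based fusion rule, and the denoising update are all simplified stand-ins invented for illustration. It only shows the general pattern of blending features from a front portrait and a generated back-of-head reference according to camera yaw while iteratively refining a noisy sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_reference(image):
    """Toy 'appearance encoder': per-image mean/std statistics.
    A hypothetical stand-in for the learned appearance modules."""
    return np.array([image.mean(), image.std()])

def denoise_step(x, t, front_feat, back_feat, camera_yaw):
    """One toy denoising step. Blends front/back appearance features
    by camera yaw (0 = front view, pi = back view); this fusion rule
    is an assumption for illustration, not the paper's method."""
    w_back = (1.0 - np.cos(camera_yaw)) / 2.0  # 0 at front, 1 at back
    appearance = (1.0 - w_back) * front_feat + w_back * back_feat
    # Pull the sample toward the blended appearance mean as t decreases.
    target_mean = appearance[0]
    return x + (t / 10.0) * (target_mean - x.mean())

# Toy "front portrait" and "generated back reference" images.
front = rng.normal(0.6, 0.1, (8, 8))
back = rng.normal(0.3, 0.1, (8, 8))
f_feat, b_feat = encode_reference(front), encode_reference(back)

x = rng.normal(0.0, 1.0, (8, 8))   # start from pure noise
for t in range(10, 0, -1):         # coarse-to-fine refinement
    # Render the back view: the back reference dominates the blend.
    x = denoise_step(x, t, f_feat, b_feat, camera_yaw=np.pi)

# The sample's mean statistics now match the back reference, so
# back-facing renders stay consistent with the back-of-head image.
```

The point of the sketch is the role of the second (back) reference branch: without it, views near 180° have no appearance signal to lock onto, which is the front-back inconsistency the dual appearance module is described as addressing.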