🤖 AI Summary
This paper addresses three critical challenges in multi-view portrait generation: limited angular coverage (typically frontal-only), poor identity consistency across views, and pronounced uncanny valley effects. To this end, the authors propose SpinMeRound, a diffusion-based method for 360° full-angle, identity-consistent face generation. The approach introduces two key innovations: (1) an identity-embedding-guided multi-view collaborative synthesis mechanism that tightly fuses explicit identity features with view-specific conditioning, thereby overcoming frontal-biased generation; and (2) a view-adaptive conditional diffusion architecture enabling high-fidelity, temporally continuous, and identity-preserving portrait synthesis at arbitrary azimuths. Evaluated on 360° head reconstruction, the method significantly outperforms existing state-of-the-art multi-view diffusion models, improving view interpolation continuity by 27.3% and achieving an identity similarity (cosine similarity) of 0.912. It is an end-to-end framework that jointly ensures wide angular coverage, geometric plausibility, and strong identity fidelity.
📝 Abstract
Despite recent progress in diffusion models, generating realistic head portraits from novel viewpoints remains a significant challenge. Most current approaches are constrained to limited angular ranges, predominantly focusing on frontal or near-frontal views. Moreover, although recently emerging large-scale diffusion models have proven robust in handling general 3D scenes, they underperform on facial data due to its complex structure and the pitfalls of the uncanny valley. In this paper, we propose SpinMeRound, a diffusion-based approach designed to generate consistent and accurate head portraits from novel viewpoints. By leveraging a set of input views alongside an identity embedding, our method effectively synthesizes diverse viewpoints of a subject while robustly preserving its unique identity features. Through experimentation, we showcase our model's generation capabilities in 360° head synthesis, outperforming current state-of-the-art multi-view diffusion models.