SPGen: Spherical Projection as Consistent and Flexible Representation for Single Image 3D Shape Generation

📅 2025-09-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing single-view 3D generation methods rely on multi-view diffusion priors; they suffer from inter-view inconsistency and struggle to model complex internal structures and non-trivial topologies. To address this, we propose Spherical Projection (SP), a 3D representation that projects shape geometry onto a bounding sphere and unwraps it into a multi-layer 2D structure, enabling consistent and flexible single-image-driven generation. Because the SP mapping is injective and encodes surface geometry from a single viewpoint, it eliminates inter-view contradictions, while the multi-layer maps support watertight or open surfaces as well as nested internal structures. Joint fine-tuning of a 2D geometric encoder and 2D diffusion priors over the SP representation keeps computation low while improving geometric fidelity. Experiments show that the method outperforms existing baselines in geometric quality and computational efficiency, even when fine-tuned with limited computational resources.

📝 Abstract
Existing single-view 3D generative models typically adopt multiview diffusion priors to reconstruct object surfaces, yet they remain prone to inter-view inconsistencies and are unable to faithfully represent complex internal structures or nontrivial topologies. To address these limitations, we present SPGen, which encodes geometry information by projecting it onto a bounding sphere and unwrapping it into a compact, structured multi-layer 2D Spherical Projection (SP) representation. Operating solely in the image domain, SPGen offers three key advantages simultaneously: (1) Consistency. The injective SP mapping encodes surface geometry from a single viewpoint, which naturally eliminates view inconsistency and ambiguity; (2) Flexibility. Multi-layer SP maps represent nested internal structures and support direct lifting to watertight or open 3D surfaces; (3) Efficiency. The image-domain formulation allows the direct inheritance of powerful 2D diffusion priors and enables efficient fine-tuning with limited computational resources. Extensive experiments demonstrate that SPGen significantly outperforms existing baselines in geometric quality and computational efficiency.
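To make the multi-layer idea concrete, here is a minimal sketch of how a shape might be unwrapped into layered spherical depth maps and lifted back to 3D. It is an illustration under stated assumptions, not the paper's implementation: rays are assumed to be cast from the center of the bounding sphere over an equirectangular grid, the shape is a toy set of analytic spheres instead of a mesh, and every name (`ray_directions`, `spherical_projection`, `lift_to_points`) is hypothetical.

```python
import numpy as np

def ray_directions(height, width):
    """Unit directions for an equirectangular (polar angle, azimuth) grid."""
    theta = (np.arange(height) + 0.5) / height * np.pi      # polar angle in (0, pi)
    phi = (np.arange(width) + 0.5) / width * 2.0 * np.pi    # azimuth in (0, 2*pi)
    theta, phi = np.meshgrid(theta, phi, indexing="ij")
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)                # shape (H, W, 3)

def spherical_projection(spheres, height=32, width=64, num_layers=4):
    """Store the sorted hit distances of rays cast from the origin.

    spheres: list of (center, radius) pairs standing in for a real mesh.
    Returns an array of shape (num_layers, H, W); np.inf marks 'no hit'.
    """
    dirs = ray_directions(height, width)
    hits = []
    for center, radius in spheres:
        center = np.asarray(center, dtype=float)
        # Solve |t*d - c|^2 = R^2 for t along every ray direction d.
        b = dirs @ center
        disc = b**2 - (center @ center - radius**2)
        valid = disc >= 0
        root = np.sqrt(np.maximum(disc, 0.0))
        for t in (b - root, b + root):
            hits.append(np.where(valid & (t > 1e-9), t, np.inf))
    depths = np.sort(np.stack(hits), axis=0)                 # nearest hit first
    pad = max(0, num_layers - depths.shape[0])
    depths = np.concatenate([depths, np.full((pad, height, width), np.inf)])
    return depths[:num_layers]

def lift_to_points(sp_maps):
    """Invert the projection: one 3D point per finite (layer, pixel) entry."""
    _, height, width = sp_maps.shape
    dirs = ray_directions(height, width)
    mask = np.isfinite(sp_maps)
    safe = np.where(mask, sp_maps, 0.0)                      # avoid inf * 0
    return (safe[..., None] * dirs[None])[mask]              # shape (N, 3)

# Toy shape (assumed): an outer shell, a nested inner sphere, and a small
# off-center sphere that some rays enter and exit.
spheres = [((0.0, 0.0, 0.0), 0.8), ((0.1, 0.0, 0.0), 0.3), ((0.55, 0.0, 0.0), 0.1)]
sp_maps = spherical_projection(spheres)
points = lift_to_points(sp_maps)
```

In this toy setup each pixel of each layer holds at most one surface point, so the nested inner sphere and the outer shell coexist in the same stack of 2D maps. Note that layers are ordered by hit distance along the ray, so one layer can mix different surfaces across pixels.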
Problem

Research questions and friction points this paper is trying to address.

Addresses inter-view inconsistencies in single-view 3D generation
Represents complex internal structures and nontrivial topologies
Enables efficient 3D shape generation using 2D diffusion priors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spherical projection for 3D shape representation
Injective, single-viewpoint multi-layer 2D mapping eliminates view inconsistency
Image-domain formulation inherits 2D diffusion priors and enables efficient fine-tuning
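One way to read the injectivity claim is sketched below. The notation is ours rather than the paper's, and it assumes rays are cast from the center $\mathbf{c}$ of the bounding sphere, with $r_k(\theta,\varphi)$ denoting the distance to the $k$-th surface intersection along the ray with polar angle $\theta$ and azimuth $\varphi$.

```latex
\mathbf{d}(\theta,\varphi) =
  \begin{pmatrix} \sin\theta\cos\varphi \\ \sin\theta\sin\varphi \\ \cos\theta \end{pmatrix},
\qquad
S_k(\theta,\varphi) = r_k(\theta,\varphi),
\qquad
\mathbf{p}_k(\theta,\varphi) = \mathbf{c} + r_k(\theta,\varphi)\,\mathbf{d}(\theta,\varphi),
\qquad
0 < r_1 < r_2 < \dots
```

Under these assumptions, two distinct directions give rays that meet only at $\mathbf{c}$, and along a single ray distinct layers $k$ have distinct distances, so each recorded surface point corresponds to exactly one entry $(\theta,\varphi,k)$. No second view of the same point exists that could disagree with the first.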

Authors

Jingdong Zhang | Chapman-Schmidt Fellow, Imperial College London | Stochastic Control, Machine learning, Complex System, Physics
Weikai Chen | Principal Research Scientist, Tencent America | 3D AIGC, 3D Vision, Computer graphics, VLM
Yuan Liu | Hong Kong University of Science and Technology, China
Jionghao Wang | Texas A&M University, USA
Zhengming Yu | Texas A&M University, USA
Zhuowen Shen | Texas A&M University, USA
Bo Yang | Waymo, USA
Wenping Wang | Texas A&M University | Computer Graphics, Geometric Computing
Xin Li | Texas A&M University, USA