🤖 AI Summary
In generative 3D reconstruction, narrow-baseline local observations (such as single-view or small-FOV images) pose two key challenges: severe viewpoint limitation and inconsistent generation of occluded regions. To address these, this paper introduces a fine-tuning-free, zero-shot fusion framework. Methodologically, it enforces joint alignment of multi-source priors (textual, CLIP-based, and depth) within the DDIM sampling process, and incorporates an iterative, geometry-guided implicit-field optimization that integrates local dense observations with semantic and geometric priors. Experiments demonstrate that the approach significantly outperforms state-of-the-art methods across multiple benchmarks, with marked improvements in the visual consistency and geometric completeness of unobserved regions, enabling high-fidelity, globally consistent zero-shot 3D reconstruction from narrow-baseline inputs alone.
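The core fusion idea, aligning several prior-conditioned noise predictions inside deterministic DDIM sampling, can be sketched as follows. This is an illustrative toy, not the paper's implementation: the three predictors, the weights, and the schedule are all hypothetical stand-ins for the text-, CLIP-, and depth-conditioned models.

```python
import numpy as np

def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0)."""
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps

def fused_noise(x_t, t, priors, weights):
    """Fuse noise predictions from several priors into one estimate.

    `priors` is a list of callables (x_t, t) -> eps; in the paper these
    would be text-, CLIP-, and depth-conditioned predictors.
    """
    return sum(w * p(x_t, t) for p, w in zip(priors, weights)) / sum(weights)

# Toy stand-ins for the three prior-conditioned noise predictors.
text_prior  = lambda x, t: 0.8 * x
clip_prior  = lambda x, t: 1.1 * x
depth_prior = lambda x, t: 0.9 * x

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4))                  # toy "latent image"
alpha_bars = np.linspace(0.05, 0.999, 10)    # increasing alpha-bar schedule
for i in range(len(alpha_bars) - 1, 0, -1):  # denoise from t = T down to 1
    eps = fused_noise(x, i, [text_prior, clip_prior, depth_prior],
                      weights=[1.0, 0.5, 0.5])
    x = ddim_step(x, eps, alpha_bars[i], alpha_bars[i - 1])
```

Because DDIM sampling is deterministic, fusing the priors at every step yields one shared trajectory, which is what keeps the generated supervision views mutually consistent.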
📝 Abstract
Generative 3D reconstruction shows strong potential for handling incomplete observations. While sparse-view and single-image reconstruction are well studied, partial observation remains underexplored. In this setting, dense views are available only within a narrow angular range, while other perspectives are inaccessible. This task presents two main challenges: (i) Limited view range: observations confined to a narrow angular scope render traditional interpolation techniques, which require evenly distributed perspectives, ineffective. (ii) Inconsistent generation: views synthesized for invisible regions often lack coherence both with the visible regions and with each other, compromising reconstruction consistency. To address these challenges, we propose a novel training-free approach that integrates local dense observations with multi-source priors for reconstruction. Our method introduces a fusion-based strategy that aligns these priors during DDIM sampling, generating multi-view consistent images to supervise invisible views. We further design an iterative refinement strategy that exploits the geometric structure of the object to enhance reconstruction quality. Extensive experiments on multiple datasets show the superiority of our method over state-of-the-art approaches, especially in invisible regions.
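The iterative refinement strategy alternates between generating pseudo-supervision for the invisible views from the current geometry and re-optimizing the implicit field against both real and pseudo targets. The toy below sketches that loop under heavy simplification: a linear "field" with per-view matrices standing in for rendering, and least-squares fitting standing in for implicit-field optimization; every name and weight here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "implicit field": theta holds the field parameters and a
# per-view matrix plays the role of differentiable rendering.
theta_true       = rng.normal(size=8)
render_visible   = rng.normal(size=(16, 8))  # densely observed angular range
render_invisible = rng.normal(size=(16, 8))  # unobserved viewpoints
y_visible = render_visible @ theta_true      # the only real supervision

def iterative_refine(theta, rounds=5, steps=100, lr=0.01, w_pseudo=0.1):
    for _ in range(rounds):
        # (1) use the current geometry to generate pseudo-targets for the
        #     invisible views (stand-in for prior-guided view generation)
        y_pseudo = render_invisible @ theta
        # (2) re-optimize the field against real + pseudo supervision
        for _ in range(steps):
            g = render_visible.T @ (render_visible @ theta - y_visible)
            g += w_pseudo * render_invisible.T @ (render_invisible @ theta - y_pseudo)
            theta = theta - lr * g
    return theta

theta0 = np.zeros(8)
theta_ref = iterative_refine(theta0)
err_before = np.linalg.norm(theta0 - theta_true)
err_after = np.linalg.norm(theta_ref - theta_true)
```

Down-weighting the pseudo term (`w_pseudo`) keeps the real observations dominant, so each round tightens the geometry that conditions the next round's generated views.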