🤖 AI Summary
To address structural incompleteness in multi-view 3D object reconstruction caused by sparse viewpoints and occlusions, this paper proposes a novel method that tightly integrates generative priors with reconstruction frameworks. The approach features two key innovations: (1) a reconstruction-aware prior mechanism that strengthens cross-view feature correlation; and (2) a multi-view image-feature-conditioned diffusion model, enhanced by cross-attention optimization and iterative denoising control, which improves the controllability of local detail generation and keeps geometry and texture consistent with the inputs. Experiments demonstrate that the method preserves high fidelity to input views while significantly enhancing both global shape completeness and local geometric accuracy. Quantitative and qualitative evaluations show superior performance over state-of-the-art reconstruction-based and generative methods across standard benchmarks.
📝 Abstract
Existing multi-view 3D object reconstruction methods rely heavily on sufficient overlap between input views, so occlusions and sparse coverage in practice frequently yield severely incomplete reconstructions. Recent advances in diffusion-based 3D generative techniques offer the potential to address these limitations by leveraging learned generative priors to hallucinate invisible parts of objects, thereby generating plausible 3D structures. However, the stochastic nature of the inference process limits the accuracy and reliability of generation results, preventing existing reconstruction frameworks from integrating such 3D generative priors. In this work, we comprehensively analyze why diffusion-based 3D generative methods fail to achieve high consistency, including (a) the insufficiency in constructing and leveraging cross-view connections when extracting multi-view image features as conditions, and (b) the poor controllability of iterative denoising during local detail generation, which easily produces fine geometric and texture details that are plausible yet inconsistent with the inputs. Accordingly, we propose ReconViaGen, which innovatively integrates reconstruction priors into the generative framework and devises several strategies that effectively address these issues. Extensive experiments demonstrate that ReconViaGen can reconstruct complete and accurate 3D models consistent with input views in both global structure and local details. Project page: https://jiahao620.github.io/reconviagen.
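To make the conditioning idea in (a) concrete, the sketch below shows a generic multi-view cross-attention step: denoiser latent tokens attend over features pooled from all input views, so every view can influence every generated token. This is a minimal NumPy illustration of the general mechanism, not the paper's actual architecture; all names and shapes here are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_view_cross_attend(latent_tokens, view_features):
    """Condition denoiser tokens on multi-view image features.

    latent_tokens: (Nq, d) array of query tokens from the diffusion denoiser.
    view_features: list of (Nk_i, d) arrays, one flattened feature map per view.
    Returns: (Nq, d) context-conditioned update for the latent tokens.
    (Illustrative only; the paper's real conditioning scheme is not shown here.)
    """
    # Concatenating per-view features lets attention build cross-view links
    # instead of conditioning on each view in isolation.
    kv = np.concatenate(view_features, axis=0)          # (sum_i Nk_i, d)
    d = latent_tokens.shape[-1]
    scores = latent_tokens @ kv.T / np.sqrt(d)          # (Nq, sum_i Nk_i)
    attn = softmax(scores, axis=-1)                     # rows sum to 1
    return attn @ kv                                    # (Nq, d)

# Toy usage: 8 latent tokens attend over features from 3 views.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 16))
views = [rng.standard_normal((10, 16)) for _ in range(3)]
out = multi_view_cross_attend(tokens, views)
print(out.shape)  # (8, 16)
```

Pooling all views into a single key/value set is the simplest way to correlate features across views; per-view attention with learned view embeddings is a common alternative.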