GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

📅 2026-05-22

🤖 AI Summary
This work addresses high-fidelity, editable 3D reconstruction of large-scale indoor scenes from multi-view RGB images. The proposed method decomposes the scene into spatially overlapping local chunks and extends an object-level generative prior, such as Trellis.2, to the full scene scale. It introduces a projection-based conditioning mechanism that lifts posed multi-view image features into a spatially anchored 3D representation that does not depend on view ordering, guiding the generative model toward outputs consistent with the observed images. The resulting meshes are multi-view consistent, editable, and ready for physically based rendering (PBR), and the method is reported to outperform state-of-the-art reconstruction methods by 16% on indoor scenes.
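To make the chunk decomposition concrete, here is a minimal sketch of tiling a scene bounding box with overlapping cubic chunks. The paper summary does not give the chunk size, overlap, or layout, so the function name `chunk_origins`, its parameters, and the regular axis-aligned grid are all assumptions for illustration only.

```python
import numpy as np

def chunk_origins(extent_min, extent_max, chunk_size, overlap):
    """Origins of axis-aligned cubic chunks covering a box with a fixed overlap.

    Hypothetical helper: the regular grid layout is an assumption,
    not the layout used in the paper.
    """
    extent_min = np.asarray(extent_min, dtype=float)
    extent_max = np.asarray(extent_max, dtype=float)
    stride = chunk_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    axes = []
    for lo, hi in zip(extent_min, extent_max):
        # Enough chunks along this axis that the last one reaches `hi`.
        n = max(1, int(np.ceil((hi - lo - chunk_size) / stride)) + 1)
        axes.append(lo + stride * np.arange(n))
    grid = np.meshgrid(*axes, indexing="ij")
    # One row per chunk: the (x, y, z) corner with the smallest coordinates.
    return np.stack([g.ravel() for g in grid], axis=-1)

# Example: a 10 x 4 x 3 room, chunks of side 4 that overlap by 1.
origins = chunk_origins([0, 0, 0], [10, 4, 3], chunk_size=4.0, overlap=1.0)
```

Neighbouring chunks share a band of width `overlap`, which gives a per-chunk generator some shared context at chunk boundaries. How the shared region is actually reconciled is not described in this summary.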
📝 Abstract
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially-localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models -- we use Trellis.2 as an example -- which we generalize to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation aligned with the generative model, independent of view ordering and spatially anchored to the scene, yielding high-fidelity, multi-view consistent generated geometry. This enables lifting the strong object-level prior of Trellis.2 to multi-view, scene-scale generation, producing faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity results that outperform cutting-edge reconstruction methods by 16%.
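The projection-based conditioning described above can be illustrated with a small sketch: 3D points (for example voxel centres of a chunk) are projected into every posed view, a feature is sampled at each projection, and the samples are averaged. Mean pooling, nearest-neighbour sampling, the pinhole camera model, and the name `project_and_pool` are assumptions made here; the abstract only states that the conditioning is independent of view ordering and anchored in scene space.

```python
import numpy as np

def project_and_pool(points, feats, K, world_to_cam):
    """Lift per-view 2D features onto 3D points by projection and averaging.

    points:       (N, 3) world-space points
    feats:        (V, H, W, C) per-view feature maps
    K:            (V, 3, 3) pinhole intrinsics
    world_to_cam: (V, 4, 4) extrinsics
    Returns (N, C) pooled features and (N,) number of views seeing each point.
    Hypothetical sketch, not the paper's conditioning module.
    """
    V, H, W, C = feats.shape
    N = points.shape[0]
    homog = np.concatenate([points, np.ones((N, 1))], axis=1)
    acc = np.zeros((N, C))
    cnt = np.zeros(N)
    for v in range(V):
        cam = (world_to_cam[v] @ homog.T).T[:, :3]   # points in camera frame
        z = cam[:, 2]
        in_front = z > 1e-6
        safe_z = np.where(in_front, z, 1.0)          # avoid division by zero
        pix = (K[v] @ cam.T).T
        u = np.round(pix[:, 0] / safe_z).astype(int) # column index
        r = np.round(pix[:, 1] / safe_z).astype(int) # row index
        valid = in_front & (u >= 0) & (u < W) & (r >= 0) & (r < H)
        acc[valid] += feats[v, r[valid], u[valid]]
        cnt += valid
    # Averaging is symmetric in the views, so view order does not matter.
    pooled = acc / np.maximum(cnt, 1)[:, None]
    return pooled, cnt
```

Because the pooled feature is indexed by the 3D point rather than by the view, the result is tied to scene coordinates, and because a mean is symmetric in its inputs, shuffling the views leaves it unchanged. Points seen by no view receive a zero feature in this sketch.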
Problem

Research questions and friction points this paper is trying to address.

3D scene reconstruction
multi-view RGB images
generative prior
high-fidelity geometry
scene-scale generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Prior
Multi-View 3D Reconstruction
Conditional 3D Generation
Scene-Scale Modeling
Projection-Based Conditioning