GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction

📅 2026-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses high-fidelity, editable 3D reconstruction of large-scale indoor scenes from multi-view RGB images. The proposed method decomposes the scene into spatially overlapping local chunks and extends an object-level generative prior (Trellis.2 is used as the example) to the full scene scale for the first time. It introduces a view-order-invariant, spatially anchored 3D conditioning mechanism that projects posed multi-view image features into 3D, guiding the generative model toward outputs consistent with the observed images. The reported result is a 16% improvement over state-of-the-art reconstruction methods on indoor scenes, with output meshes that are multi-view consistent, editable, and ready for physically based rendering (PBR).
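To make the chunk decomposition concrete, here is a minimal sketch of one way to tile an axis-aligned scene volume with overlapping cubic chunks. The summary does not give the chunk size, overlap, or tiling rule, so the function names (`axis_starts`, `tile_scene`), the fixed-stride layout, and the clamping of the last chunk are all assumptions made for illustration, not the paper's procedure.

```python
from itertools import product


def axis_starts(extent, chunk, overlap):
    """Start offsets of chunks of length `chunk` that cover [0, extent].

    Neighbouring chunks share at least `overlap` units along the axis.
    (Hypothetical helper; the layout rule is an assumption.)
    """
    if not 0 <= overlap < chunk:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk")
    if extent <= chunk:
        return [0.0]  # a single chunk already covers the axis
    stride = chunk - overlap
    starts = []
    s = 0.0
    while s + chunk < extent:
        starts.append(s)
        s += stride
    # Clamp the last chunk to the scene boundary instead of overshooting.
    starts.append(float(extent - chunk))
    return starts


def tile_scene(extent_xyz, chunk, overlap):
    """Return (min_corner, max_corner) boxes of cubic chunks tiling the scene."""
    axes = [axis_starts(e, chunk, overlap) for e in extent_xyz]
    boxes = []
    for origin in product(*axes):
        boxes.append((origin, tuple(o + chunk for o in origin)))
    return boxes
```

For example, `tile_scene((10.0, 4.0, 3.0), 4.0, 1.0)` produces three chunks whose x-origins are 0, 3, and 6. Along an axis shorter than the chunk size, the single chunk simply extends past the scene boundary.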

📝 Abstract

We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially localized, overlapping chunks that together tile the scene, scaling generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models (we use Trellis.2 as an example), which we generalize to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation aligned with the generative model, independent of view ordering and spatially anchored to the scene, yielding high-fidelity, multi-view consistent generated geometry. This lifts the strong object-level prior of Trellis.2 to multi-view, scene-scale generation, producing faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity results that outperform cutting-edge reconstruction methods by 16%.
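The projection-based conditioning can be illustrated with a small sketch: a 3D point is projected into each posed view with a pinhole camera model, the image feature at the projected pixel is sampled, and the samples are averaged. Averaging is used here only because it is one simple, order-independent way to fuse views. The abstract does not specify the actual fusion operator, the feature extractor, or the sampling scheme, so the view dictionary layout, nearest-pixel sampling, and the names `project` and `lift_features` are assumptions.

```python
def project(point, view):
    """Pinhole projection of a world-space point; returns (u, v, depth) or None."""
    R, t = view["R"], view["t"]
    # World -> camera: x_cam = R @ x_world + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 1e-8:
        return None  # point is behind (or on) the camera plane
    u = view["fx"] * xc[0] / xc[2] + view["cx"]
    v = view["fy"] * xc[1] / xc[2] + view["cy"]
    return u, v, xc[2]


def lift_features(point, views):
    """Average the image features a 3D point projects onto, over views that see it.

    The mean does not depend on the order of `views`. Returns None if no
    view observes the point. (Mean pooling is an illustrative assumption.)
    """
    acc, hits = None, 0
    for view in views:
        uvz = project(point, view)
        if uvz is None:
            continue
        col, row = int(round(uvz[0])), int(round(uvz[1]))
        fmap = view["features"]  # H x W x C nested lists
        if not (0 <= row < len(fmap) and 0 <= col < len(fmap[0])):
            continue  # projects outside the image
        feat = fmap[row][col]
        acc = list(feat) if acc is None else [a + f for a, f in zip(acc, feat)]
        hits += 1
    if hits == 0:
        return None
    return [a / hits for a in acc]
```

Because the conditioning is computed per 3D location in scene coordinates, it is anchored to the scene rather than to any particular camera. A full system would evaluate it at every voxel of a chunk and would likely handle occlusion, which this sketch ignores.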

Problem

Research questions and friction points this paper is trying to address.

3D scene reconstruction

multi-view RGB images

generative prior

high-fidelity geometry

scene-scale generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Prior

Multi-View 3D Reconstruction

Conditional 3D Generation