🤖 AI Summary
This work addresses the limited accuracy of monocular depth estimation when applied to multi-view geometric reconstruction. To bridge this gap, we explicitly embed multi-view geometric priors derived from Structure-from-Motion (SfM) sparse reconstructions into a monocular depth estimation framework. Methodologically, we propose a geometry-guided depth network, an end-to-end trainable geometric consistency loss, and a multi-scale feature alignment module, enabling tight coupling between monocular depth and multi-view geometry without explicit Multi-View Stereo (MVS) optimization. Our key contribution is the first integration of SfM-derived geometric constraints as strong, direct supervision within the monocular depth learning pipeline. Experiments demonstrate that our method achieves significantly higher depth prediction accuracy than state-of-the-art monocular approaches. Moreover, on diverse real-world scenes, including indoor, street-view, and aerial imagery, our reconstructed 3D geometry consistently surpasses the best current MVS methods in quality.
📝 Abstract
In this paper, we present a new method for multi-view geometric reconstruction. In recent years, large vision models have developed rapidly, performing excellently across various tasks and demonstrating remarkable generalization capabilities. Some works apply large vision models to monocular depth estimation, and their outputs have been used to facilitate multi-view reconstruction tasks in an indirect manner. Due to the inherent ambiguity of monocular depth estimation, the estimated depth values are usually not accurate enough, limiting their utility in aiding multi-view reconstruction. We propose to incorporate SfM information, a strong multi-view prior, into the depth estimation process, thus enhancing the quality of depth prediction and enabling its direct application in multi-view geometric reconstruction. Experimental results on public real-world datasets show that our method significantly improves the quality of depth estimation compared to previous monocular depth estimation works. Additionally, we evaluate the reconstruction quality of our approach on various types of scenes, including indoor, streetscape, and aerial views, surpassing state-of-the-art MVS methods. The code and supplementary materials are available at https://zju3dv.github.io/murre/.
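To make the idea of an SfM-derived prior concrete: one common way to feed sparse SfM geometry into a depth network is to project the reconstructed 3D points into each camera view, producing a sparse depth map that can condition or supervise the prediction. The sketch below is illustrative only and is not the paper's actual pipeline; the function name and interface are assumptions, and a pinhole camera model with world-to-camera extrinsics `(R, t)` is assumed.

```python
import numpy as np

def sfm_points_to_sparse_depth(points_world, R, t, K, hw):
    """Project SfM 3D points into a camera to form a sparse depth map.

    points_world: (N, 3) points from the SfM sparse reconstruction.
    R, t: world-to-camera rotation (3, 3) and translation (3,).
    K: (3, 3) pinhole intrinsics.
    hw: (H, W) output resolution.
    Returns an (H, W) float32 depth map, 0 where no point projects.
    """
    H, W = hw
    # Transform points into the camera frame.
    p_cam = points_world @ R.T + t
    z = p_cam[:, 2]
    valid = z > 1e-6  # keep only points in front of the camera
    p_cam, z = p_cam[valid], z[valid]
    # Perspective projection to pixel coordinates.
    uv = (p_cam @ K.T)[:, :2] / z[:, None]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    u, v, z = u[inside], v[inside], z[inside]
    depth = np.zeros((H, W), dtype=np.float32)
    # If several points land on one pixel, keep the nearest:
    # write far points first so near ones overwrite them.
    order = np.argsort(-z)
    depth[v[order], u[order]] = z[order]
    return depth
```

Such a sparse depth map is metrically consistent across views (up to the SfM scale), which is exactly the multi-view constraint that purely monocular predictions lack.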