SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization

📅 2024-01-12

🏛️ AAAI Conference on Artificial Intelligence

📈 Citations: 33

✨ Influential: 2

🤖 AI Summary

To address the poor completeness of multi-view stereo (MVS) reconstruction in textureless regions and its heavy reliance on manually tuned hyperparameters, this paper proposes SD-MVS, a segmentation-driven MVS method. It is the first to adopt the Segment Anything Model (SAM) in MVS, using the resulting instance segmentation as a constraint for pixelwise patch deformation in both matching cost computation and propagation. It further introduces a refinement strategy that combines spherical coordinates with gradient descent on normals and a pixelwise search interval on depths, improving the completeness of the reconstructed model. In addition, the Expectation-Maximization (EM) algorithm alternately optimizes the aggregate matching cost and the hyperparameters, reducing dependence on empirical tuning. On the ETH3D high-resolution benchmark and the Tanks and Temples dataset, the method achieves state-of-the-art results with less time consumption.

📝 Abstract

In this paper, we introduce Segmentation-Driven Deformation Multi-View Stereo (SD-MVS), a method that can effectively tackle challenges in 3D reconstruction of textureless areas. We are the first to adopt the Segment Anything Model (SAM) to distinguish semantic instances in scenes and further leverage these constraints for pixelwise patch deformation on both matching cost and propagation. Concurrently, we propose a unique refinement strategy that combines spherical coordinates and gradient descent on normals and pixelwise search interval on depths, significantly improving the completeness of the reconstructed 3D model. Furthermore, we adopt the Expectation-Maximization (EM) algorithm to alternately optimize the aggregate matching cost and hyperparameters, effectively mitigating the problem of parameters being excessively dependent on empirical tuning. Evaluations on the ETH3D high-resolution multi-view stereo benchmark and the Tanks and Temples dataset demonstrate that our method can achieve state-of-the-art results with less time consumption.
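The idea of refining normals in spherical coordinates with gradient descent can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the step size, the finite-difference gradient, and above all `toy_cost`, which is a placeholder and not the paper's matching cost. The point is only that a normal parameterized by two angles stays unit length after every update, so no renormalization step is needed.

```python
import math

def normal_from_angles(theta, phi):
    """Unit normal from polar angle theta and azimuth phi (radians)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def toy_cost(normal, target):
    """Placeholder cost (hypothetical): 1 - cosine similarity to a target normal."""
    return 1.0 - sum(a * b for a, b in zip(normal, target))

def refine_normal(theta, phi, target, step=0.5, iters=200, eps=1e-5):
    """Gradient descent on (theta, phi) using forward finite differences."""
    for _ in range(iters):
        base = toy_cost(normal_from_angles(theta, phi), target)
        g_theta = (toy_cost(normal_from_angles(theta + eps, phi), target) - base) / eps
        g_phi = (toy_cost(normal_from_angles(theta, phi + eps), target) - base) / eps
        theta -= step * g_theta
        phi -= step * g_phi
    return theta, phi

# Usage: start from a rough normal hypothesis and refine it toward the target.
target = normal_from_angles(0.4, 1.0)
theta, phi = refine_normal(1.0, 0.2, target)
print(toy_cost(normal_from_angles(theta, phi), target))
```

A two-angle parameterization has a coordinate singularity at the poles (theta near 0 or pi), where the azimuth is poorly defined; a practical implementation would need to handle that case.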

Problem

Research questions and friction points this paper is trying to address.

Reconstructing 3D models in textureless areas using segmentation and deformation

Improving model completeness via spherical refinement and gradient optimization

Reducing empirical parameter tuning through EM algorithm optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses SAM for semantic instance segmentation constraints

Refines normals with spherical coordinates and gradient descent, and depths with a pixelwise search interval

Employs the EM algorithm to alternately optimize the aggregate matching cost and hyperparameters
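The alternating structure of EM can be sketched on a generic toy problem. The example below fits a two-component 1D Gaussian mixture; it is a stand-in chosen for illustration, since this summary does not specify the paper's actual objective, hidden variables, or hyperparameters. All names (`em_two_gaussians`, `gaussian_pdf`) and the initialization are assumptions.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def em_two_gaussians(samples, iters=50):
    """Alternate E- and M-steps for a two-component 1D Gaussian mixture."""
    means = [min(samples), max(samples)]  # crude initial guesses
    stds = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: soft assignment of each sample under the current parameters.
        resp = []
        for x in samples:
            p = [weights[k] * gaussian_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, samples)) / nk
            stds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / len(samples)
    return means, stds, weights

# Usage: two well-separated clusters of synthetic samples.
random.seed(0)
samples = ([random.gauss(0.0, 0.5) for _ in range(200)]
           + [random.gauss(5.0, 0.5) for _ in range(200)])
means, stds, weights = em_two_gaussians(samples)
print(means, stds, weights)
```

The pattern to note is that the parameters are estimated from the data inside the loop rather than set by hand, which is the sense in which an EM-style alternation can reduce empirical tuning.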

🔎 Similar Papers

MSP-MVS: Multi-granularity Segmentation Prior Guided Multi-View Stereo