Surf3R: Rapid Surface Reconstruction from Sparse RGB Views in Seconds

📅 2025-08-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multi-view 3D reconstruction methods depend on precise camera calibration and accurate pose estimation, which makes preprocessing complex and deployment difficult. This paper proposes Surf3R, an end-to-end feedforward framework that reconstructs surfaces from sparse RGB images alone, without pose priors, and completes a scene in under 10 seconds. The approach introduces two key components: (1) a multi-branch, multi-view decoder with cross-view attention, in which multiple reference views jointly guide reconstruction; and (2) a D-Normal regularizer based on an explicit 3D Gaussian representation, which couples surface normals with other geometric parameters and optimizes them jointly to improve 3D consistency and surface detail. On ScanNet++ and Replica, the method achieves state-of-the-art results on multiple surface reconstruction metrics while remaining efficient and generalizing well.

📝 Abstract
Current multi-view 3D reconstruction methods rely on accurate camera calibration and pose estimation, requiring complex and time-intensive pre-processing that hinders their practical deployment. To address this challenge, we introduce Surf3R, an end-to-end feedforward approach that reconstructs 3D surfaces from sparse views without estimating camera poses and completes an entire scene in under 10 seconds. Our method employs a multi-branch and multi-view decoding architecture in which multiple reference views jointly guide the reconstruction process. Through the proposed branch-wise processing, cross-view attention, and inter-branch fusion, the model effectively captures complementary geometric cues without requiring camera calibration. Moreover, we introduce a D-Normal regularizer based on an explicit 3D Gaussian representation for surface reconstruction. It couples surface normals with other geometric parameters to jointly optimize the 3D geometry, significantly improving 3D consistency and surface detail accuracy. Experimental results demonstrate that Surf3R achieves state-of-the-art performance on multiple surface reconstruction metrics on the ScanNet++ and Replica datasets, exhibiting excellent generalization and efficiency.
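The abstract names cross-view attention as one ingredient of the decoder but gives no layer details. As a rough illustration only, the generic idea can be sketched as single-head attention in which the tokens of one reference view query the tokens of all other views. The function name, shapes, projection matrices, and the residual connection are assumptions made for this sketch, not Surf3R's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(ref_tokens, src_tokens, w_q, w_k, w_v):
    """Hypothetical single-head cross-view attention.

    ref_tokens: (N, D) tokens of one reference view.
    src_tokens: (V, N, D) tokens of the V other views.
    w_q, w_k:   (D, Dh) query/key projections.
    w_v:        (D, D) value projection (keeps width D for the residual).
    """
    q = ref_tokens @ w_q                               # (N, Dh)
    flat = src_tokens.reshape(-1, src_tokens.shape[-1])  # (V*N, D)
    k = flat @ w_k                                     # (V*N, Dh)
    v = flat @ w_v                                     # (V*N, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (N, V*N)
    weights = softmax(scores, axis=-1)                 # each row sums to 1
    # Residual update: reference tokens absorb cues from the other views.
    return ref_tokens + weights @ v, weights

rng = np.random.default_rng(0)
ref = rng.normal(size=(4, 8))      # 4 tokens, width 8
src = rng.normal(size=(3, 4, 8))   # 3 other views
w_q = rng.normal(size=(8, 5))
w_k = rng.normal(size=(8, 5))
w_v = rng.normal(size=(8, 8))
out, weights = cross_view_attention(ref, src, w_q, w_k, w_v)
```

In a multi-branch design as described in the abstract, each branch would presumably run such a step with a different reference view before the branches are fused; how Surf3R performs that fusion is not specified here.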
Problem

Research questions and friction points this paper is trying to address.

Reconstructs 3D surfaces from sparse RGB views without camera poses
Eliminates need for complex camera calibration and pose estimation
Improves 3D consistency and surface detail accuracy efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end feedforward sparse view reconstruction
Multi-branch cross-view attention fusion
D-Normal regularizer with Gaussian representation
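The summary does not define the D-Normal regularizer beyond saying that it couples surface normals with other geometric parameters. One plausible reading, assumed here, is that a normal is derived from depth and compared against a predicted normal, so that a normal loss also constrains the geometry that produced the depth. The sketch below works on a plain depth map with a pinhole camera rather than on 3D Gaussians; the function names and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
import numpy as np

def normals_from_depth(depth, fx, fy, cx, cy):
    """Derive camera-facing unit normals from a depth map (assumed pinhole model).

    Returns an (H-2, W-2, 3) array for the interior pixels.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel to a 3D point in the camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1)          # (H, W, 3)
    # Central differences along image columns (du) and rows (dv).
    du = pts[1:-1, 2:] - pts[1:-1, :-2]
    dv = pts[2:, 1:-1] - pts[:-2, 1:-1]
    # Cross product order chosen so normals point toward the camera (-z).
    n = np.cross(dv, du)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.maximum(norm, 1e-12)

def normal_consistency_loss(pred_normals, depth_normals):
    """Mean (1 - cosine similarity) between predicted and depth-derived normals."""
    cos = np.sum(pred_normals * depth_normals, axis=-1)
    return float(np.mean(1.0 - cos))

# A fronto-parallel plane at depth 2: every normal should be (0, 0, -1).
depth = np.full((6, 7), 2.0)
n = normals_from_depth(depth, fx=100.0, fy=100.0, cx=3.0, cy=2.5)
pred = np.zeros_like(n)
pred[..., 2] = -1.0
loss = normal_consistency_loss(pred, n)
```

Because the depth-derived normal depends on the depth values themselves, a loss of this form sends gradients to whatever parameters generate the depth, which is the coupling effect the regularizer is described as providing.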
Haodong Zhu
Beihang University, China
Changbai Li
Beihang University, China
Yangyang Ren
Beihang University, China
Zichao Feng
Beihang University, China
Xuhui Liu
Beihang University
Computer Vision, AIGC, Foundation Model
Hanlin Chen
National University of Singapore, Singapore
Xiantong Zhen
United Imaging
Medical Image Analysis, Machine Learning, Computer Vision
Baochang Zhang
Technische Universität München
Computer assisted intervention, Medical image analysis, Deep learning