AMB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend

📅 2025-11-25
🤖 AI Summary
This paper targets dense 3D reconstruction that is accurate and metric-scale consistent while still generalizing across tasks. It proposes a feed-forward multi-view reconstruction framework that uses a sparse yet compact volumetric scene representation as its backend for geometric reasoning, with no test-time optimization or task-specific fine-tuning. Although trained only for multi-view reconstruction, the model extends directly to uncalibrated online visual odometry (VO) and large-scale structure from motion (SfM). It reports state-of-the-art camera pose, depth, metric-scale, and 3D reconstruction results relative to prior pointmap-based models, and on common benchmarks it surpasses optimization-based SLAM and SfM methods that use dense reconstruction priors.

📝 Abstract
We present AMB3R, a multi-view feed-forward model for dense, metric-scale 3D reconstruction that addresses diverse 3D vision tasks. The key idea is to leverage a sparse, yet compact, volumetric scene representation as our backend, enabling geometric reasoning with spatial compactness. Although trained solely for multi-view reconstruction, we demonstrate that AMB3R can be seamlessly extended to uncalibrated online visual odometry or large-scale structure from motion without the need for task-specific fine-tuning or test-time optimization. Compared to prior pointmap-based models, our approach achieves state-of-the-art performance in camera pose, depth, and metric-scale estimation as well as 3D reconstruction, and it even surpasses optimization-based SLAM and SfM methods with dense reconstruction priors on common benchmarks.
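
To make the idea of a "sparse, yet compact" volumetric representation concrete, here is a minimal sketch of sparse voxelization in Python with NumPy. It is not AMB3R's backend, whose details the abstract does not give. The function name `sparse_voxelize`, the `voxel_size` parameter, and the choice to store per-voxel centroids and counts are all assumptions made for illustration. The sketch only shows how a set of metric 3D points can be stored as occupied voxels rather than as a dense grid.

```python
import numpy as np

def sparse_voxelize(points, voxel_size):
    """Group an (N, 3) array of metric points into occupied voxels only.

    Hypothetical helper for illustration; not the paper's implementation.
    Returns integer voxel coordinates, per-voxel centroids, and point counts.
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer grid coordinate of the voxel containing each point.
    coords = np.floor(points / voxel_size).astype(np.int64)
    # Keep one row per occupied voxel; `inverse` maps each point to its voxel.
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # shape differs across NumPy versions
    counts = np.bincount(inverse, minlength=len(voxels))
    # Accumulate point positions per voxel, then average them.
    sums = np.zeros((len(voxels), 3))
    np.add.at(sums, inverse, points)
    centroids = sums / counts[:, None]
    return voxels, centroids, counts

# Three points in metres: two fall in the same 5 cm voxel, one is far away.
pts = np.array([[0.01, 0.02, 0.03],
                [0.04, 0.01, 0.02],
                [1.01, 0.00, 0.00]])
voxels, centroids, counts = sparse_voxelize(pts, voxel_size=0.05)
print(voxels, counts)
```

Memory in this sketch scales with the number of occupied voxels, not with the volume of the scene's bounding box. That is the general appeal of sparse volumetric storage.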
Problem

Research questions and friction points this paper is trying to address.

Develops metric-scale 3D reconstruction from multi-view images
Extends to visual odometry without task-specific fine-tuning
Surpasses optimization-based methods in camera pose and depth estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feed-forward volumetric reconstruction with compact backend
Seamless extension to odometry without fine-tuning
Outperforms optimization-based SLAM in metric reconstruction