Reliev3R: Relieving Feed-forward Reconstruction from Multi-View Geometric Annotations

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the scalability limits of existing feed-forward 3D reconstruction models, which rely on costly multi-view geometric annotations such as 3D point maps and camera poses. The authors propose Reliev3R, a weakly supervised paradigm that draws 3D knowledge only from monocular relative depths and sparse image correspondences, removing the need for multi-view labels and compute-heavy structure-from-motion preprocessing. Both signals come from zero-shot predictions of pretrained models. An ambiguity-aware relative depth loss and a trigonometry-based reprojection loss supervise multi-view geometric consistency during training. Trained from scratch with less data, Reliev3R is reported to catch up with its fully supervised counterparts, a step toward low-cost, scalable 3D reconstruction.
📝 Abstract
With recent advances, Feed-forward Reconstruction Models (FFRMs) have demonstrated great potential in reconstruction quality and adaptability to multiple downstream tasks. However, their excessive reliance on multi-view geometric annotations, e.g., 3D point maps and camera poses, makes the fully-supervised training scheme of FFRMs difficult to scale up. In this paper, we propose Reliev3R, a weakly-supervised paradigm for training FFRMs from scratch without cost-prohibitive multi-view geometric annotations. Relieving the reliance on geometric sensory data and compute-intensive structure-from-motion preprocessing, our method draws 3D knowledge directly from monocular relative depths and sparse image correspondences given by zero-shot predictions of pretrained models. At the core of Reliev3R, we design an ambiguity-aware relative depth loss and a trigonometry-based reprojection loss to facilitate supervision for multi-view geometric consistency. Training from scratch with less data, Reliev3R catches up with its fully-supervised sibling models, taking a step towards low-cost 3D reconstruction supervision and scalable FFRMs.
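The abstract names two supervision signals without giving their formulas. As a rough illustration only, the sketch below shows two generic stand-ins, not the paper's actual losses: a scale-and-shift-invariant depth error (assuming relative depth is defined only up to an unknown scale and shift) and a pinhole reprojection error for one correspondence. All function names, the intrinsics `focal`, `cx`, `cy`, and the L1 error are assumptions made for this example.

```python
from math import hypot


def align_scale_shift(pred, target):
    """Closed-form least-squares scale s and shift t so that s*pred + t ~ target."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    cov = sum((p - mean_p) * (q - mean_t) for p, q in zip(pred, target))
    s = cov / var_p if var_p > 0 else 0.0
    return s, mean_t - s * mean_p


def relative_depth_loss(pred, target):
    """Mean absolute error after removing the unknown scale and shift (assumed form)."""
    s, t = align_scale_shift(pred, target)
    return sum(abs(s * p + t - q) for p, q in zip(pred, target)) / len(pred)


def project(point_cam, focal, cx, cy):
    """Pinhole projection of a 3D point given in camera coordinates (z > 0)."""
    x, y, z = point_cam
    return focal * x / z + cx, focal * y / z + cy


def reprojection_error(point_cam, pixel, focal, cx, cy):
    """Pixel distance between the projected 3D point and its matched 2D keypoint."""
    u, v = project(point_cam, focal, cx, cy)
    return hypot(u - pixel[0], v - pixel[1])


# Hypothetical toy values: the target depth is an affine transform of the prediction.
pred_depth = [1.0, 2.0, 3.0, 4.0]
pseudo_label = [2.0 * d + 5.0 for d in pred_depth]
depth_loss = relative_depth_loss(pred_depth, pseudo_label)

# Hypothetical correspondence: predicted 3D point versus a matched pixel.
pixel_error = reprojection_error((1.0, 2.0, 4.0), (78.0, 104.0), 100.0, 50.0, 50.0)
```

In this toy setup the depth loss vanishes because the pseudo-label differs from the prediction only by scale and shift, which is the kind of ambiguity a relative depth loss is meant to tolerate. The reprojection term penalizes disagreement between predicted geometry and a matched pixel in another view.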
Problem

Research questions and friction points this paper is trying to address.

Feed-forward Reconstruction Models
multi-view geometric annotations
weakly-supervised training
3D reconstruction
scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

weakly-supervised learning
feed-forward reconstruction
relative depth
multi-view consistency
zero-shot prediction