Light3R-SfM: Towards Feed-forward Structure-from-Motion

📅 2025-01-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
For large-scale Structure-from-Motion (SfM) on unconstrained image collections, this work proposes a feed-forward, end-to-end trainable 3D reconstruction framework that avoids the cost of conventional pipelines built on expensive matching followed by global optimization. Methodologically, it introduces (1) a latent global alignment module that replaces explicit global optimization with a learnable attention mechanism capturing multi-view constraints across images, and (2) a sparse scene graph built from a retrieval-score-guided shortest path tree, which substantially reduces memory usage and computational overhead compared with the naive approach. Experiments show competitive reconstruction accuracy with significantly reduced runtime, making the method well suited to real-world 3D reconstruction under runtime constraints.

📝 Abstract
We present Light3R-SfM, a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion (SfM) from unconstrained image collections. Unlike existing SfM solutions that rely on costly matching and global optimization to achieve accurate 3D reconstructions, Light3R-SfM addresses this limitation through a novel latent global alignment module. This module replaces traditional global optimization with a learnable attention mechanism, effectively capturing multi-view constraints across images for robust and precise camera pose estimation. Light3R-SfM constructs a sparse scene graph via a retrieval-score-guided shortest path tree to dramatically reduce memory usage and computational overhead compared to the naive approach. Extensive experiments demonstrate that Light3R-SfM achieves competitive accuracy while significantly reducing runtime, making it ideal for 3D reconstruction tasks in real-world applications with runtime constraints. This work pioneers a data-driven, feed-forward SfM approach, paving the way toward scalable, accurate, and efficient 3D reconstruction in the wild.
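To make the sparse scene graph idea concrete, here is a minimal sketch of building a shortest path tree from pairwise retrieval scores. It is not the authors' implementation: the function name `shortest_path_tree`, the cost `1 - score`, and the choice of root (the image with the highest total similarity) are all assumptions made for illustration. The point it shows is that a tree over N images keeps N - 1 edges instead of the N(N - 1)/2 pairs of an exhaustive graph.

```python
import heapq


def shortest_path_tree(scores):
    """Return (root, edges) of a shortest path tree over images.

    scores[i][j] is a hypothetical retrieval similarity in [0, 1];
    higher means the two images are more likely to overlap.
    """
    n = len(scores)

    def cost(i, j):
        # Assumption: turn a similarity into a non-negative edge cost.
        return 1.0 - scores[i][j]

    # Assumption: root the tree at the image most similar to all others.
    root = max(range(n), key=lambda i: sum(scores[i][j] for j in range(n) if j != i))

    # Dijkstra over the fully connected graph, recording each node's parent.
    dist = [float("inf")] * n
    parent = [None] * n
    dist[root] = 0.0
    heap = [(0.0, root)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v in range(n):
            if v == u or v in done:
                continue
            nd = d + cost(u, v)
            if nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))

    # One (parent, child) edge per non-root image.
    edges = sorted((parent[v], v) for v in range(n) if v != root)
    return root, edges


# Toy scores for four images that overlap roughly in a chain 0-1-2-3.
scores = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.9, 0.3],
    [0.2, 0.9, 1.0, 0.9],
    [0.1, 0.3, 0.9, 1.0],
]
root, edges = shortest_path_tree(scores)
print(root, edges)
```

In this toy case image 3 attaches to the tree through image 2 rather than directly to the root, because two high-score hops are cheaper than one low-score hop. Only the three tree edges would then be passed to pairwise processing, rather than all six image pairs.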
Problem

Research questions and friction points this paper is trying to address.

3D reconstruction
image processing
computational efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Light3R-SfM
Latent Global Alignment Module
Data-Driven Process