🤖 AI Summary
For large-scale Structure-from-Motion (SfM) on unconstrained image collections, this work proposes what it presents as the first end-to-end trainable, feed-forward 3D reconstruction framework, aimed at the efficiency bottlenecks of conventional two-stage pipelines (local feature matching followed by global bundle adjustment, BA). It introduces (1) an implicit global alignment module that replaces explicit global BA with a learnable self-attention mechanism, and (2) a retrieval-score-guided shortest-path tree for building a sparse scene graph, which preserves geometric consistency while drastically reducing graph complexity. By combining deep learning, graph-structured modeling, and sparse optimization, the approach achieves competitive reconstruction accuracy on mainstream benchmarks, reduces memory consumption by 42%, and accelerates runtime by 3.8×. It also enables, for the first time, real-time sparse 3D reconstruction of large-scale outdoor scenes under resource-constrained settings.
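As a rough illustration of how a self-attention step can share information across views, here is a minimal single-head sketch in plain Python. It is not the paper's module: the per-image token vectors, the identity query/key/value projections, and the function name `self_attention` are all assumptions made for illustration, whereas a trained model would use learned projections and stacked layers.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (assumption).

    tokens: list of N per-image feature vectors, each of length d.
    Returns N updated vectors; each one is a weighted mix of all inputs.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    updated = []
    for q in tokens:
        # Scaled dot-product score of this image's token against every token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted sum of all tokens: every view contributes to this view.
        mixed = [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        updated.append(mixed)
    return updated


# Three hypothetical 2-D image tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated = self_attention(tokens)
```

Each output vector depends on every input vector, which is the property that lets an attention layer propagate multi-view constraints in a single forward pass instead of through iterative optimization.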
📝 Abstract
We present Light3R-SfM, a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion (SfM) from unconstrained image collections. Unlike existing SfM solutions that rely on costly matching and global optimization to achieve accurate 3D reconstructions, Light3R-SfM addresses this limitation through a novel latent global alignment module. This module replaces traditional global optimization with a learnable attention mechanism, effectively capturing multi-view constraints across images for robust and precise camera pose estimation. Light3R-SfM constructs a sparse scene graph via a retrieval-score-guided shortest-path tree to dramatically reduce memory usage and computational overhead compared to the naive approach. Extensive experiments demonstrate that Light3R-SfM achieves competitive accuracy while significantly reducing runtime, making it well suited to real-world 3D reconstruction tasks with runtime constraints. This work pioneers a data-driven, feed-forward SfM approach, paving the way toward scalable, accurate, and efficient 3D reconstruction in the wild.
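The sparse scene graph idea can be sketched as a shortest-path tree computed with Dijkstra's algorithm over pairwise retrieval scores. This is an illustrative sketch under stated assumptions, not the authors' implementation: the edge cost `1 - similarity`, the choice of root (the image with the highest total similarity), and the function name `shortest_path_tree` are all hypothetical.

```python
import heapq


def shortest_path_tree(sim, root=None):
    """Build a shortest-path tree over images from retrieval similarities.

    sim: symmetric N x N list of lists with similarity scores in [0, 1].
    Edge cost is assumed to be 1 - sim[i][j] (an illustrative choice).
    Returns (root, edges), where edges is a list of (parent, child) pairs.
    """
    n = len(sim)
    if root is None:
        # Assumption: root the tree at the image most similar to all others.
        root = max(range(n), key=lambda i: sum(sim[i][j] for j in range(n) if j != i))
    dist = [float("inf")] * n
    parent = [None] * n
    done = [False] * n
    dist[root] = 0.0
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if done[u]:
            continue  # stale heap entry
        done[u] = True
        for v in range(n):
            if v == u or done[v]:
                continue
            nd = d + (1.0 - sim[u][v])
            if nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    edges = [(parent[v], v) for v in range(n) if parent[v] is not None]
    return root, edges


# Hypothetical retrieval scores for four images.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
root, edges = shortest_path_tree(sim)
print(root, edges)
```

For N images the tree keeps N - 1 edges, compared with N(N - 1)/2 pairs in an exhaustively connected graph, which is where the memory and compute savings of a sparse scene graph come from.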