Light3R-SfM: Towards Feed-forward Structure-from-Motion

📅 2025-01-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

For large-scale Structure-from-Motion (SfM) on unconstrained image collections, this work pioneers an end-to-end trainable, feed-forward 3D reconstruction framework that avoids the efficiency bottleneck of conventional two-stage pipelines (local feature matching followed by global bundle adjustment). It introduces (1) a latent global alignment module that replaces explicit global optimization with a learnable attention mechanism capturing multi-view constraints across images, and (2) a sparse scene graph built as a retrieval-score-guided shortest-path tree, which cuts memory use and computation relative to a naive dense graph. The reported results are competitive reconstruction accuracy on mainstream benchmarks, a 42% reduction in memory consumption, and a 3.8× runtime speedup, making the method suited to real-world sparse 3D reconstruction of large-scale scenes under a runtime constraint.
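To make the attention-based alignment idea concrete, here is a minimal sketch of scaled dot-product self-attention across one token per image. It is an illustration only, not the paper's module: the function name `self_attention`, the use of a single token per image, and the omission of learned query/key/value projections are all assumptions made for brevity.

```python
import math

def self_attention(tokens):
    """Single-head scaled dot-product self-attention over per-image tokens.

    tokens: list of n feature vectors (one per image), each of length d.
    Assumption for this sketch: identity projections instead of learned
    query/key/value weights, so the output is a softmax-weighted average
    of the input tokens.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this image's token to every image's token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        # Numerically stable softmax over the scores.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        # Mix information from all images into this image's token.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

# Hypothetical 2-D tokens for three images.
mixed = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each output token is a convex combination of all input tokens, which is how attention lets every image's representation take the other views into account in a single forward pass instead of an iterative optimization.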

📝 Abstract

We present Light3R-SfM, a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion (SfM) from unconstrained image collections. Unlike existing SfM solutions that rely on costly matching and global optimization to achieve accurate 3D reconstructions, Light3R-SfM addresses this limitation through a novel latent global alignment module. This module replaces traditional global optimization with a learnable attention mechanism, effectively capturing multi-view constraints across images for robust and precise camera pose estimation. Light3R-SfM constructs a sparse scene graph via retrieval-score-guided shortest path tree to dramatically reduce memory usage and computational overhead compared to the naive approach. Extensive experiments demonstrate that Light3R-SfM achieves competitive accuracy while significantly reducing runtime, making it ideal for 3D reconstruction tasks in real-world applications with a runtime constraint. This work pioneers a data-driven, feed-forward SfM approach, paving the way toward scalable, accurate, and efficient 3D reconstruction in the wild.
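The sparse scene graph can be pictured with a small sketch: given pairwise retrieval similarities, run Dijkstra's algorithm from a root image and keep only the tree edges. This is a sketch under stated assumptions rather than the authors' implementation: the edge cost `1 - similarity`, the root-selection heuristic, and the name `shortest_path_tree` are hypothetical choices for illustration.

```python
import heapq

def shortest_path_tree(sim, root=None):
    """Build a shortest-path tree over images from retrieval similarities.

    sim: symmetric n x n list of similarities in [0, 1].
    Assumptions for this sketch: edge cost is 1 - similarity, and the
    default root is the image with the highest total similarity to others.
    Returns (root, edges) where edges is a list of (parent, child) pairs.
    """
    n = len(sim)
    if root is None:
        root = max(range(n), key=lambda i: sum(sim[i]) - sim[i][i])
    dist = [float("inf")] * n
    parent = [None] * n
    done = [False] * n
    dist[root] = 0.0
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if done[u]:
            continue  # stale heap entry
        done[u] = True
        for v in range(n):
            if v == u or done[v]:
                continue
            nd = d + (1.0 - sim[u][v])
            if nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return root, [(parent[v], v) for v in range(n) if parent[v] is not None]

# Hypothetical similarity matrix for four images.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
root, edges = shortest_path_tree(sim)
```

For n images the tree has n - 1 edges, whereas a naive graph over all pairs has n(n - 1)/2, which is where the memory and compute savings come from as collections grow.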

Problem

Research questions and friction points this paper is trying to address.

3D reconstruction

image processing

computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Light3R-SfM

Latent Global Alignment Module

Data-Driven Process
