🤖 AI Summary
This work addresses a limitation of conventional image-retrieval-based pose graph initialization in Structure-from-Motion (SfM): each image pair is treated independently, with no notion of global consistency, which degrades reconstruction accuracy in sparse or high-speed settings. To overcome this, the authors propose edge prioritization, in which a graph neural network (GNN) trained with SfM-derived supervision predicts globally consistent edge reliability. The initial pose graph is then built from multiple minimum spanning trees guided by these ranks, and a connectivity-aware score modulation strategy reinforces weakly connected regions and reduces graph diameter. By bringing global context into edge prioritization, the method yields more reliable and compact pose graphs, improving reconstruction accuracy in sparse and high-speed settings and outperforming state-of-the-art retrieval methods on ambiguous scenes.
📝 Abstract
The pose graph is a core component of Structure-from-Motion (SfM), where images act as nodes and edges encode relative poses. Since geometric verification is expensive, SfM pipelines restrict the pose graph to a sparse set of candidate edges, making initialization critical. Existing methods rely on image retrieval to connect each image to its $k$ nearest neighbors, treating pairs independently and ignoring global consistency. We address this limitation through the concept of edge prioritization, ranking candidate edges by their utility for SfM. Our approach has three components: (1) a GNN trained with SfM-derived supervision to predict globally consistent edge reliability; (2) multi-minimal-spanning-tree-based pose graph construction guided by these ranks; and (3) connectivity-aware score modulation that reinforces weak regions and reduces graph diameter. This globally informed initialization yields more reliable and compact pose graphs, improving reconstruction accuracy in sparse and high-speed settings and outperforming SOTA retrieval methods on ambiguous scenes. The code and trained models are available at https://github.com/weitong8591/global_edge_prior.
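To make component (2) more concrete, here is a minimal sketch of one way a pose graph could be assembled from several spanning trees given per-edge reliability scores. It is not the authors' implementation: the function name `multi_spanning_tree_graph`, the `num_trees` parameter, and the choice to extract successive edge-disjoint maximum-score spanning forests with Kruskal's algorithm are all assumptions made for illustration, and the scores would in practice come from the trained GNN rather than being hand-written.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


class UnionFind:
    """Disjoint-set structure used by Kruskal's algorithm."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))

    def find(self, x: int) -> int:
        # Path halving keeps the trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> bool:
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def multi_spanning_tree_graph(
    num_images: int,
    scores: Dict[Edge, float],
    num_trees: int,
) -> Set[Edge]:
    """Union of `num_trees` edge-disjoint spanning forests.

    Hypothetical helper: each round runs Kruskal over the edges not yet
    selected, visiting them from highest to lowest score, so every round
    adds the most reliable edges that still connect separate components.
    """
    remaining: List[Edge] = sorted(scores, key=scores.get, reverse=True)
    selected: Set[Edge] = set()
    for _ in range(num_trees):
        uf = UnionFind(num_images)
        leftover: List[Edge] = []
        for u, v in remaining:
            if uf.union(u, v):
                selected.add((u, v))      # edge joins two components
            else:
                leftover.append((u, v))   # would close a cycle this round
        remaining = leftover              # later rounds reuse skipped edges
    return selected


# Toy example: 4 images with made-up reliability scores.
example_scores: Dict[Edge, float] = {
    (0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.7,
    (0, 2): 0.6, (1, 3): 0.5, (0, 3): 0.1,
}
pose_graph = multi_spanning_tree_graph(4, example_scores, num_trees=2)
```

A single tree keeps the graph connected with the fewest edges, while each additional round adds redundancy from the next most reliable edges. Only the selected edges would then be passed on to geometric verification.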