🤖 AI Summary
Existing methods (e.g., NoPe-NeRF) model only local image relationships, limiting their ability to accurately recover camera poses in complex scenes without pose priors. To address this, we propose a local-to-global NeRF pose optimization framework: (1) relative pose initialization via feature trajectories; (2) joint optimization of local radiance fields and camera poses; and (3) explicit feature-matching constraints to drive geometrically consistent global bundle adjustment. This is the first work to seamlessly integrate local appearance representations with global geometric cues within NeRF, enabling end-to-end, pose-prior-free robust reconstruction. Extensive experiments on multiple benchmarks demonstrate significant improvements in camera pose estimation accuracy and novel-view synthesis fidelity—particularly under challenging conditions including dynamic occlusions, textureless regions, and large-baseline configurations.
📝 Abstract
In this paper, we introduce NoPe-NeRF++, a novel local-to-global optimization algorithm for training Neural Radiance Fields (NeRF) without requiring pose priors. Existing methods, most notably NoPe-NeRF, focus solely on local relationships between images and often struggle to recover accurate camera poses in complex scenarios. To overcome these challenges, our approach begins with relative pose initialization based on explicit feature matching, followed by a local joint optimization that refines the pose estimates for training a more robust NeRF representation. This step significantly improves the quality of the initial poses. We then introduce a global optimization phase that enforces geometric consistency through bundle adjustment, integrating feature trajectories to further refine the poses and, in turn, improve the quality of the NeRF. Notably, our method is the first to seamlessly combine local and global cues within NeRF, and it outperforms state-of-the-art methods in both pose estimation accuracy and novel view synthesis. Extensive evaluations on benchmark datasets demonstrate superior performance and robustness, even in challenging scenes, validating our design choices.
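As a rough illustration of the first stage, relative pose initialization from feature matches can be sketched with the classic eight-point algorithm: estimate the essential matrix from normalized correspondences, then decompose it into candidate rotation/translation pairs. This is a minimal NumPy sketch under standard assumptions, not the paper's implementation; the function names are ours, and the cheirality check that selects among the four candidates is omitted.

```python
import numpy as np

def estimate_essential(x1, x2):
    """Linear eight-point estimate of the essential matrix E.

    x1, x2: (N, 2) matched points in the two views, given in normalized
    camera coordinates (intrinsics already removed). Each match yields one
    linear constraint x2_h^T E x1_h = 0.
    """
    n = x1.shape[0]
    h1 = np.hstack([x1, np.ones((n, 1))])  # homogeneous coords, view 1
    h2 = np.hstack([x2, np.ones((n, 1))])  # homogeneous coords, view 2
    # Row n is the outer product h2[n] h1[n]^T flattened row-major, so that
    # A @ E.flatten() stacks all epipolar constraints.
    A = np.einsum('ni,nj->nij', h2, h1).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)  # null vector of A (least-squares if noisy)
    # Project onto the essential manifold: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def decompose_essential(E):
    """Return the four (R, t) candidates consistent with E = [t]_x R."""
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    rotations = []
    for R in (U @ W @ Vt, U @ W.T @ Vt):
        if np.linalg.det(R) < 0:  # force a proper rotation
            R = -R
        rotations.append(R)
    t = U[:, 2]  # translation direction (scale is unobservable)
    return [(R, s * t) for R in rotations for s in (1.0, -1.0)]
```

In a full pipeline, the candidate passing the cheirality test (triangulated points in front of both cameras) seeds the local joint optimization; a robust estimator such as RANSAC would wrap the linear solve to handle outlier matches.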