🤖 AI Summary
Monocular panoramic visual odometry often suffers from limited robustness under aggressive motion and illumination changes because it relies on handcrafted features or photometric consistency. This work proposes 360DVO, the first end-to-end deep learning framework for the task. It introduces a distortion-aware spherical feature extractor (DAS-Feat) that adaptively handles panoramic image distortion, together with a novel omnidirectional differentiable bundle adjustment (ODBA) module for high-precision pose refinement. Evaluated on a newly curated real-world benchmark and existing synthetic datasets, the method significantly outperforms state-of-the-art approaches, improving robustness by 50% and accuracy by 37.5%.
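The summary does not describe DAS-Feat's internals, so the following is only a minimal sketch of the general idea of distortion-aware feature extraction on equirectangular panoramas: in the common SphereNet-style formulation, a kernel's horizontal sampling footprint is widened by 1/cos(latitude) to compensate for the projection's stretch near the poles. The function name and all implementation choices below (bilinear `grid_sample`, border padding instead of longitude wrapping) are our assumptions, not the paper's operator.

```python
import torch
import torch.nn.functional as F

def distortion_aware_conv(feat: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Illustrative distortion-aware convolution (not the paper's DAS-Feat).
    feat:   (B, C, H, W) equirectangular feature map.
    weight: (C_out, C, k, k) kernel applied with latitude-adaptive sampling."""
    B, C, H, W = feat.shape
    Cout, _, k, _ = weight.shape
    r = k // 2
    dev, dt = feat.device, feat.dtype

    # Latitude of each row; equirectangular projection over-stretches rows
    # near the poles, so the kernel footprint widens there by 1/cos(lat).
    lat = ((torch.arange(H, device=dev, dtype=dt) + 0.5) / H - 0.5) * torch.pi
    stretch = 1.0 / torch.cos(lat).clamp(min=0.1)            # (H,)

    # Pixel-centre grid in grid_sample's normalized [-1, 1] coordinates.
    ys = (torch.arange(H, device=dev, dtype=dt) + 0.5) / H * 2 - 1
    xs = (torch.arange(W, device=dev, dtype=dt) + 0.5) / W * 2 - 1
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")           # each (H, W)

    dy = torch.arange(-r, r + 1, device=dev, dtype=dt) * (2.0 / H)
    dx = torch.arange(-r, r + 1, device=dev, dtype=dt) * (2.0 / W)

    patches = []
    for i in range(k):
        for j in range(k):
            # Widen the horizontal offset per latitude. A full implementation
            # would also wrap longitude rather than clamp at the border.
            x = gx + dx[j] * stretch[:, None]
            y = gy + dy[i]
            grid = torch.stack([x, y], dim=-1).expand(B, H, W, 2)
            patches.append(F.grid_sample(feat, grid, mode="bilinear",
                                         padding_mode="border",
                                         align_corners=False))

    # Samples are kernel-position major, channel minor; reorder the weights
    # to match, then contract as a 1x1 convolution.
    stacked = torch.cat(patches, dim=1)                      # (B, k*k*C, H, W)
    w = weight.permute(0, 2, 3, 1).reshape(Cout, k * k * C, 1, 1)
    return F.conv2d(stacked, w)
```

As a quick check of shapes, `distortion_aware_conv(torch.randn(1, 16, 64, 128), torch.randn(32, 16, 3, 3))` returns a `(1, 32, 64, 128)` map, i.e. the output resolution matches the input while each response near the poles aggregates a horizontally wider neighborhood.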
📝 Abstract
Monocular omnidirectional visual odometry (OVO) systems leverage 360-degree cameras to overcome the field-of-view limitations of perspective VO systems. However, existing methods, which rely on handcrafted features or photometric objectives, often lack robustness in challenging scenarios such as aggressive motion and varying illumination. To address this, we present 360DVO, the first deep learning-based OVO framework. Our approach introduces a distortion-aware spherical feature extractor (DAS-Feat) that adaptively learns distortion-resistant features from 360-degree images. These sparse feature patches are then used to establish constraints for effective pose estimation within a novel omnidirectional differentiable bundle adjustment (ODBA) module. To facilitate evaluation in realistic settings, we also contribute a new real-world OVO benchmark. Extensive experiments on this benchmark and public synthetic datasets (TartanAir V2 and 360VO) demonstrate that 360DVO surpasses state-of-the-art baselines (including 360VO and OpenVSLAM), improving robustness by 50% and accuracy by 37.5%.
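The abstract likewise does not give ODBA's formulation. As a generic illustration of what a differentiable bundle adjustment step over omnidirectional observations can look like, the sketch below performs one damped Gauss-Newton update on chordal residuals between observed and predicted unit bearing vectors on the sphere; every helper name (`hat`, `exp_so3`, `gn_step`), the pose parameterization, and the residual choice are assumptions for illustration only.

```python
import torch

def hat(w: torch.Tensor) -> torch.Tensor:
    """3-vector -> 3x3 skew-symmetric matrix."""
    wx, wy, wz = w
    O = torch.zeros((), dtype=w.dtype, device=w.device)
    return torch.stack([torch.stack([O, -wz, wy]),
                        torch.stack([wz, O, -wx]),
                        torch.stack([-wy, wx, O])])

def exp_so3(w: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula; exact and differentiable away from theta = 0."""
    theta = w.norm().clamp(min=1e-8)
    K = hat(w / theta)
    I = torch.eye(3, dtype=w.dtype, device=w.device)
    return I + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

def residuals(xi: torch.Tensor, pts: torch.Tensor,
              bearings: torch.Tensor) -> torch.Tensor:
    """Chordal residuals between observed unit bearings (N, 3) and 3D points
    (N, 3) transformed by the pose xi = (translation, rotation), shape (6,)."""
    R = exp_so3(xi[3:])
    p = pts @ R.T + xi[:3]
    pred = p / p.norm(dim=-1, keepdim=True)   # predicted bearing on the sphere
    return (pred - bearings).reshape(-1)

def gn_step(xi, pts, bearings, damping=1e-4):
    """One damped Gauss-Newton update. All ops are differentiable torch ops,
    so with create_graph=True gradients could flow back to the features that
    produced `bearings` during end-to-end training."""
    J = torch.autograd.functional.jacobian(
        lambda x: residuals(x, pts, bearings), xi)            # (3N, 6)
    r = residuals(xi, pts, bearings)
    H = J.T @ J + damping * torch.eye(6, dtype=xi.dtype, device=xi.device)
    return xi - torch.linalg.solve(H, J.T @ r)

if __name__ == "__main__":
    torch.manual_seed(0)
    pts = torch.randn(50, 3, dtype=torch.float64)
    bearings = pts / pts.norm(dim=-1, keepdim=True)   # observed at identity pose
    xi = 0.01 * torch.randn(6, dtype=torch.float64)   # perturbed initial pose
    for _ in range(5):
        xi = gn_step(xi, pts, bearings)
    print(residuals(xi, pts, bearings).norm())        # -> near zero
```

The point of such a module is that pose refinement becomes a layer: the solver's output is a smooth function of the feature correspondences feeding it, which is what allows the extractor and the optimizer to be trained jointly.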