🤖 AI Summary
To address instability in visual-inertial odometry (VIO) initialization, the frequent failure of joint multi-parameter optimization, and insufficient robustness in feature matching for AR/VR applications, this paper proposes a tightly coupled, gyroscope-aided fast VIO method. The approach integrates structure-from-motion (SfM) with gyroscopic constraints, continuous optical flow tracking, IMU preintegration, and nonlinear bundle adjustment (BA). Key contributions include: (1) a robust visual-inertial initialization framework that requires only four frames, significantly improving startup reliability; and (2) a hybrid feature tracking mechanism that combines continuous optical flow with deep-learning-based descriptor matching, striking a favorable balance among accuracy, speed, and robustness. Evaluated on multiple standard benchmarks, the method achieves state-of-the-art (SOTA) performance. Real-time validation on mobile devices demonstrates sub-centimeter translational RMS error and end-to-end latency under 20 ms, confirming its suitability for high-accuracy, low-latency XR localization. A rough illustration of the gyroscope-preintegration ingredient follows below.
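The summary names IMU preintegration with gyroscopic constraints as a core ingredient but gives no implementation detail. As a rough illustration only, the sketch below accumulates bias-corrected gyroscope samples into a relative-rotation prior between two camera frames (the standard preintegration rotation term); the function names and the constant-bias assumption are ours, not the paper's.

```python
import numpy as np

def so3_exp(omega):
    """Map a rotation vector to a rotation matrix via Rodrigues' formula."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def preintegrate_gyro(gyro_samples, dts, bias=np.zeros(3)):
    """Accumulate bias-corrected angular-rate samples into the relative
    rotation between two frames; such a prior can constrain visual SfM."""
    R = np.eye(3)
    for w, dt in zip(gyro_samples, dts):
        R = R @ so3_exp((np.asarray(w) - bias) * dt)
    return R
```

In a gyroscope-aided initialization, a prior of this form can constrain the relative rotations between the few frames used for SfM, which is one way the reported four-frame stability could be achieved.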
📝 Abstract
This paper presents a novel approach to Visual-Inertial Odometry (VIO), focusing on the initialization and feature matching modules. Existing initialization methods often suffer either from the poor stability of visual Structure from Motion (SfM) or from fragility when a large number of parameters must be solved simultaneously. To address these challenges, we propose a new visual-inertial initialization pipeline that robustly handles a variety of complex scenarios. By tightly coupling gyroscope measurements, we enhance the robustness and accuracy of visual SfM; our method remains stable even with only four image frames and yields competitive results. For feature matching, we introduce a hybrid method that combines optical flow and descriptor-based matching. By leveraging the robustness of continuous optical flow tracking and the accuracy of descriptor matching, our approach achieves efficient, accurate, and robust tracking. Evaluation on multiple benchmarks shows that our method achieves state-of-the-art performance in both accuracy and success rate. In addition, a video demonstration on mobile devices showcases the practical applicability of our approach to Augmented Reality/Virtual Reality (AR/VR).
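The abstract does not specify how the optical-flow and descriptor stages are combined. As a minimal, hedged sketch of the hybrid idea, the snippet below tracks points with pyramidal Lucas-Kanade optical flow and falls back to descriptor matching when too few tracks survive; OpenCV's ORB stands in for the paper's descriptors, and the threshold and function names are illustrative assumptions, not the paper's API.

```python
import cv2
import numpy as np

def hybrid_track(prev_img, cur_img, prev_pts, min_tracked=50):
    """Track prev_pts (Nx2, pixel coords) from prev_img to cur_img with
    Lucas-Kanade optical flow; if too few tracks survive, fall back to
    descriptor matching. Returns matched (prev, cur) point arrays."""
    pts = np.asarray(prev_pts, dtype=np.float32).reshape(-1, 1, 2)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_img, cur_img, pts, None)
    ok = status.ravel() == 1
    if ok.sum() >= min_tracked:
        return pts[ok].reshape(-1, 2), nxt[ok].reshape(-1, 2)

    # Fallback: descriptor matching (ORB used here purely for illustration).
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(prev_img, None)
    kp2, des2 = orb.detectAndCompute(cur_img, None)
    if des1 is None or des2 is None:
        return np.empty((0, 2), np.float32), np.empty((0, 2), np.float32)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    p1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    p2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    return p1, p2
```

The design intent this sketch mirrors is that cheap frame-to-frame flow carries most of the tracking load, while descriptor matching restores correspondences when tracking degrades (e.g., fast motion or blur), trading a little extra compute for robustness.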