🤖 AI Summary
Problem: Learning-based stereo visual odometry (VO) lacks robustness in complex environments, such as those with varying illumination, low-texture regions, and rapid motion. The weakness shows up mainly in keypoint selection and pose graph optimization (PGO).
Method: This paper proposes MAC-VO, a learning-based stereo VO framework that integrates deep feature matching, uncertainty estimation, and stereo geometric constraints. Its core contribution is a metrics-aware covariance model that captures both the spatial error of keypoint registration and the correlations between different axes. The learned uncertainty serves two purposes: it guides keypoint selection by filtering out low-quality features based on global inconsistency, and it weights the matching residuals in PGO. This replaces the texture-driven heuristics of traditional geometric methods and the scale-agnostic diagonal covariance assumed by earlier learning-based methods.
Results: On public benchmark datasets, MAC-VO outperforms existing VO algorithms and even some SLAM algorithms, especially in challenging environments. It also produces a covariance map that indicates how reliable the estimated poses are, which can support downstream decision-making in autonomous systems.
📝 Abstract
We propose MAC-VO, a novel learning-based stereo VO that leverages learned metrics-aware matching uncertainty for two purposes: selecting keypoints and weighting the residuals in pose graph optimization. Compared with traditional geometric methods that prioritize texture-rich features such as edges, our keypoint selector uses the learned uncertainty to filter out low-quality features based on global inconsistency. In contrast to learning-based algorithms that model covariance with a scale-agnostic diagonal weight matrix, we design a metrics-aware covariance model that captures the spatial error during keypoint registration and the correlations between different axes. Integrating this covariance model into pose graph optimization enhances the robustness and reliability of pose estimation, particularly in challenging environments with varying illumination, feature density, and motion patterns. On public benchmark datasets, MAC-VO outperforms existing VO algorithms and even some SLAM algorithms in challenging environments. The covariance map also provides valuable information about the reliability of the estimated poses, which can benefit decision-making for autonomous systems.
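To make the idea of a covariance with metric scale and cross-axis correlations concrete, the following is a minimal sketch and not the MAC-VO implementation. It assumes a rectified pinhole stereo camera and uses standard first-order (Jacobian) propagation to turn per-keypoint pixel and disparity variances into a full 3x3 covariance for the triangulated point. That covariance then weights a residual through a squared Mahalanobis distance. The function names, the parameters (`f`, `cx`, `cy`, `b`), and the assumption that the three input variances are independent are illustrative choices, not details taken from the paper.

```python
import numpy as np


def backproject(u, v, d, f, cx, cy, b):
    """Triangulate pixel (u, v) with disparity d into camera-frame (x, y, z)."""
    z = f * b / d  # depth from disparity, in the units of the baseline b
    return np.array([(u - cx) * z / f, (v - cy) * z / f, z])


def point_covariance(u, v, d, var_u, var_v, var_d, f, cx, cy, b):
    """First-order propagation of (u, v, d) variances to a 3x3 point covariance.

    Assumption for this sketch: the errors in u, v, and d are independent.
    """
    # Jacobian of backproject with respect to (u, v, d).
    J = np.array([
        [b / d, 0.0, -(u - cx) * b / d**2],
        [0.0, b / d, -(v - cy) * b / d**2],
        [0.0, 0.0, -f * b / d**2],
    ])
    # Off-diagonal terms appear because x and y both depend on disparity.
    return J @ np.diag([var_u, var_v, var_d]) @ J.T


def mahalanobis_sq(residual, cov):
    """Squared Mahalanobis norm r^T cov^{-1} r, used as a weighted residual."""
    return float(residual @ np.linalg.solve(cov, residual))
```

Under these assumptions, the depth variance scales with `1 / d**4`, so distant points (small disparity) contribute much less to the weighted cost than nearby ones. A point away from the principal point also gets nonzero x-z and y-z covariance, which a diagonal weight matrix cannot represent.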