LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry

📅 2024-01-03
🏛️ Computer Vision and Pattern Recognition
📈 Citations: 35
Influential: 5
📄 PDF
🤖 AI Summary
Existing visual odometry (VO) methods predominantly rely on two-frame tracking, neglecting temporal context across image sequences. This limitation hinders global motion modeling and trajectory reliability estimation, leading to significant performance degradation under occlusion, dynamic objects, and low-texture conditions. To address this, we propose the first long-horizon, arbitrary-point tracking frontend that jointly exploits visual features, inter-trajectory associations, and temporal evolution cues. Our method introduces a temporal probabilistic modeling framework coupled with a learnable iterative optimization module for per-point uncertainty inference. Key components include multi-cue deep tracking, temporal Bayesian distribution updating, differentiable iterative refinement, and a dynamic anchor selection mechanism. Evaluated on mainstream VO benchmarks, our approach consistently outperforms state-of-the-art methods, achieving substantial improvements in localization robustness and accuracy—particularly in challenging occluded, dynamic, and texture-deprived scenarios.

📝 Abstract
Visual odometry estimates the motion of a moving camera based on visual input. Existing methods, mostly focusing on two-view point tracking, often ignore the rich temporal context in the image sequence, thereby overlooking the global motion patterns and providing no assessment of the full trajectory reliability. These shortcomings hinder performance in scenarios with occlusion, dynamic objects, and low-texture areas. To address these challenges, we present the Long-term Effective Any Point Tracking (LEAP) module. LEAP innovatively combines visual, inter-track, and temporal cues with mindfully selected anchors for dynamic track estimation. Moreover, LEAP's temporal probabilistic formulation integrates distribution updates into a learnable iterative refinement module to reason about point-wise uncertainty. Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes. Our mindful integration showcases a novel practice by employing long-term point tracking as the front-end. Extensive experiments demonstrate that the proposed pipeline significantly outperforms existing baselines across various visual odometry benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Existing visual odometry methods ignore temporal context in image sequences
Current approaches overlook global motion patterns and trajectory reliability assessment
Performance is hindered in occlusion, dynamic objects, and low-texture scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines visual, inter-track, and temporal cues
Integrates probabilistic updates into iterative refinement
Employs long-term point tracking as front-end
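The paper's exact probabilistic formulation is not given on this page. As a loose illustration only (not LEAP's actual method), the idea of updating a per-point belief over time and reading its variance as a reliability score can be sketched with a simple precision-weighted Gaussian fusion; the function name and all numbers below are hypothetical:

```python
import numpy as np

def fuse_gaussian(mu_prior, var_prior, z, var_obs):
    """Precision-weighted fusion of a prior belief with a new observation.

    Returns the posterior mean and variance of a 2-D point position.
    This is a generic Bayesian update, not the paper's learnable module.
    """
    prec_prior = 1.0 / var_prior
    prec_obs = 1.0 / var_obs
    var_post = 1.0 / (prec_prior + prec_obs)
    mu_post = var_post * (prec_prior * mu_prior + prec_obs * z)
    return mu_post, var_post

# Track one point across frames: each new 2-D measurement tightens the belief.
mu, var = np.array([10.0, 5.0]), 4.0  # initial belief (pixels, pixels^2)
for z, var_obs in [(np.array([10.4, 5.2]), 2.0),
                   (np.array([9.8, 4.9]), 1.0)]:
    mu, var = fuse_gaussian(mu, var, z, var_obs)

# The posterior variance can act as a per-point uncertainty: low variance
# suggests a trustworthy track, high variance one to down-weight or drop.
```

LEAP replaces this closed-form update with a learnable iterative refinement, but the role of the output is analogous: per-point uncertainty that the VO back-end can use to weight or reject tracks.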
Weirong Chen
TU Munich, Munich Center for Machine Learning
Le Chen
MPI for Intelligent Systems
Rui Wang
Microsoft
Marc Pollefeys
Professor of Computer Science, ETH Zurich, and Director Spatial AI Lab, Microsoft
Computer Vision, Computer Graphics, Robotics, Machine Learning, Augmented Reality