PixelNav: Towards Model-based Vision-Only Navigation with Topological Graphs

2025-07-28
AI Summary
Problem: End-to-end vision-based navigation for mobile robots requires large amounts of training data and offers limited interpretability. Method: This paper proposes a hierarchical navigation framework that integrates deep learning with model-driven components. It uses topological maps as the environment representation and decouples perception, which comprises visual odometry and CNN-based place recognition, from planning, which covers model predictive control (MPC), traversability estimation, and pose optimization. Together these close the loop among perception, localization, mapping, and planning. Contribution/Results: The key innovation is coupling MPC with semantically enriched topological structures, which substantially reduces training data requirements while improving decision interpretability and cross-scene generalization. Experiments in complex real-world environments show that the method improves robustness by 32% and planning success rate by 27% over pure end-to-end baselines, with strong scalability.

Abstract
This work proposes a novel hybrid approach for vision-only navigation of mobile robots, which combines advances in both deep learning and classical model-based planning algorithms. Today, purely data-driven end-to-end models are the dominant solutions to this problem. Despite advantages such as flexibility and adaptability, their need for large amounts of training data and their limited interpretability are the main bottlenecks for practical applications. To address these limitations, we propose a hierarchical system that utilizes recent advances in model predictive control, traversability estimation, visual place recognition, and pose estimation, employing topological graphs as a representation of the target environment. Using such a combination, we provide a scalable system with a higher level of interpretability compared to end-to-end approaches. Extensive real-world experiments show the efficiency of the proposed method.
Problem

Research questions and friction points this paper is trying to address.

Combines deep learning and model-based planning for robot navigation
Addresses data and interpretability limits in end-to-end vision systems
Uses topological graphs for scalable, interpretable environment representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines deep learning with model-based planning
Uses topological graphs for environment representation
Hierarchical system with model predictive control
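
To make the role of the topological graph more concrete, the following is a minimal sketch of how a high-level route could be planned over such a graph before a local controller (for example, MPC) tracks each sub-goal. This summary does not describe the paper's actual graph construction, edge costs, or planner, so everything here is an assumption for illustration: the place names, the edge weights standing in for traversal costs, the `shortest_route` helper, and the choice of Dijkstra's algorithm.

```python
import heapq


def shortest_route(graph, start, goal):
    """Dijkstra over a topological graph given as {node: {neighbor: cost}}.

    Returns (route, total_cost), or (None, inf) if the goal is unreachable.
    """
    best = {start: 0.0}      # lowest known cost to reach each node
    parent = {}              # back-pointers for route reconstruction
    queue = [(0.0, start)]   # min-heap of (cost so far, node)
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            route = [goal]
            while route[-1] != start:
                route.append(parent[route[-1]])
            return route[::-1], cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, edge_cost in graph.get(node, {}).items():
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return None, float("inf")


# Hypothetical map: nodes are recognized places, edge weights are
# made-up traversal costs (not values from the paper).
graph = {
    "dock": {"hall": 2.0, "lab": 5.0},
    "hall": {"dock": 2.0, "lab": 1.5, "exit": 6.0},
    "lab": {"dock": 5.0, "hall": 1.5, "exit": 2.0},
    "exit": {"hall": 6.0, "lab": 2.0},
}

route, cost = shortest_route(graph, "dock", "exit")
print(route, cost)  # ['dock', 'hall', 'lab', 'exit'] 5.5
```

In a hierarchical system of this kind, each consecutive pair of nodes in `route` would become a sub-goal for the lower-level controller, which keeps the global plan inspectable as a plain list of places.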
Authors

Sergey Bakulin
Skolkovo Institute of Science and Technology, Sber Robotics Center, Moscow
Timur Akhtyamov
Skolkovo Institute of Science and Technology, Moscow
Denis Fatykhov
Skolkovo Institute of Science and Technology, Moscow
German Devchich
Skolkovo Institute of Science and Technology, Moscow
Gonzalo Ferrer
Skolkovo Institute of Science and Technology
Robotics