Inside Knowledge: Graph-based Path Generation with Explainable Data Augmentation and Curriculum Learning for Visual Indoor Navigation

📅 2025-08-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address navigation challenges caused by indoor GPS failure, this paper proposes a purely vision-based, end-to-end, real-time heading prediction method. Methodologically, we design a graph-based path generation and trajectory modeling framework, integrated with interpretable data augmentation and curriculum learning strategies, enabling automatic multi-target heading annotation solely from video frame sequences captured by mobile devices, eliminating reliance on maps, auxiliary sensors, or manual labeling. To our knowledge, this is the first work to achieve indoor navigation heading estimation exclusively from monocular video input. Evaluated on a large-scale real-world video dataset collected in a shopping mall, our Android application achieves high accuracy and ultra-low inference latency (under 30 ms). All source code, the annotated dataset, and interactive visualizations are publicly released.
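The summary describes deriving per-frame direction labels automatically from a graph of the environment. A minimal sketch of that idea, under assumed details: waypoints form an undirected graph whose edges carry headings, and a breadth-first search from each target yields, for every node, the first edge of its shortest path toward the target. The adjacency format and heading representation here are illustrative assumptions, not the paper's exact method.

```python
from collections import deque

def next_direction_labels(graph, target):
    """For every reachable node, return the heading of the first edge on
    its shortest path toward `target` (hypothetical label scheme).

    graph  -- dict: node -> list of (neighbor, heading_to_neighbor)
    target -- the destination node
    """
    labels = {target: "arrived"}
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for neighbor, _ in graph.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                # The neighbor should step toward `node`, so its label is
                # the heading of its own edge pointing at `node`.
                labels[neighbor] = next(
                    h for n, h in graph[neighbor] if n == node
                )
                queue.append(neighbor)
    return labels

# Toy corridor: A -east- B -north- C, navigating to C.
g = {
    "A": [("B", "east")],
    "B": [("A", "west"), ("C", "north")],
    "C": [("B", "south")],
}
labels = next_direction_labels(g, "C")
# labels: {"C": "arrived", "B": "north", "A": "east"}
```

Frames captured along a walk could then inherit the label of their nearest graph node, which is one plausible way the annotation becomes fully automatic.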

📝 Abstract
Indoor navigation is a difficult task, as it generally comes with poor GPS access, forcing solutions to rely on other sources of information. While significant progress continues to be made in this area, deployment to production applications is still lacking, given the complexity and additional requirements of current solutions. Here, we introduce an efficient, real-time, and easily deployable deep learning approach, based on visual input only, that can predict the direction towards a target from images captured by a mobile device. Our technical approach, based on a novel graph-based path generation method combined with explainable data augmentation and curriculum learning, includes contributions that make the process of data collection, annotation, and training as automatic, efficient, and robust as possible. On the practical side, we introduce a novel large-scale dataset, with video footage inside a relatively large shopping mall, in which each frame is annotated with the correct next direction towards different specific target destinations. Different from current methods, ours relies solely on vision, avoiding the need for special sensors, additional markers placed along the path, knowledge of the scene map, or internet access. We also created an easy-to-use application for Android, which we plan to make publicly available. We make all our data and code available along with visual demos on our project site.
Problem

Research questions and friction points this paper is trying to address.

Indoor navigation without GPS or special sensors
Automatic data collection and training for navigation
Real-time vision-based direction prediction for mobile devices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based path generation for navigation
Explainable data augmentation technique
Curriculum learning for efficient training
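The curriculum learning contribution listed above orders training from easy to hard examples. As a hedged sketch only (the paper's actual difficulty measure and pacing are not given here), one common schedule sorts samples by a difficulty score, such as distance from the target, and widens the training pool in stages:

```python
def curriculum_stages(samples, difficulty, num_stages):
    """Yield one training pool per stage, each a superset of the last.

    samples    -- list of training examples
    difficulty -- function mapping a sample to a sortable difficulty score
    num_stages -- how many stages to split training into

    This linear pacing is an illustrative assumption.
    """
    ordered = sorted(samples, key=difficulty)
    for stage in range(1, num_stages + 1):
        # Admit the easiest fraction stage / num_stages of the data.
        cutoff = round(len(ordered) * stage / num_stages)
        yield ordered[:cutoff]

# Toy example: samples tagged with a hop count to the target as difficulty.
samples = [("frame_a", 5), ("frame_b", 1), ("frame_c", 3),
           ("frame_d", 2), ("frame_e", 4)]
stages = list(curriculum_stages(samples, lambda s: s[1], num_stages=5))
# Stage 1 holds only the easiest sample; the final stage holds all five.
```

Training on such a schedule typically stabilizes early optimization because the model first fits frames where the correct heading is least ambiguous.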