Data Scaling for Navigation in Unknown Environments

📅 2026-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the poor zero-shot generalization of end-to-end visual navigation policies in unseen environments. Through a large-scale empirical analysis, it evaluates how the quantity and geographic diversity of training data affect map-free point-goal navigation policies. The authors train policies on a curated crowd-sourced video dataset spanning 161 locations in 35 countries (4,565 hours in total) and evaluate closed-loop control on sidewalk robots in four countries, covering 125 km of autonomous driving. Three findings stand out. First, geographic diversity matters far more than total data volume: doubling the number of distinct locations reduces navigation error by approximately 15%, while the benefit of adding data from existing locations saturates with very little data. Second, with noisy crowd-sourced data, simple regression-based models outperform generative and sequence-based architectures. Third, large-scale, diverse training data alone yields zero-shot performance approaching that of policies trained with environment-specific demonstrations.

📝 Abstract

Generalization of imitation-learned navigation policies to environments unseen in training remains a major challenge. We address this by conducting the first large-scale study of how data quantity and data diversity affect real-world generalization in end-to-end, map-free visual navigation. Using a curated 4,565-hour crowd-sourced dataset collected across 161 locations in 35 countries, we train policies for point-goal navigation and evaluate their closed-loop control performance on sidewalk robots operating in four countries, covering 125 km of autonomous driving. Our results show that large-scale training data enables zero-shot navigation in unknown environments, approaching the performance of policies trained with environment-specific demonstrations. Critically, we find that data diversity is far more important than data quantity. Doubling the number of geographical locations in a training set decreases navigation errors by ~15%, while the performance benefit from adding data from existing locations saturates with very little data. We also observe that, with noisy crowd-sourced data, simple regression-based models outperform generative and sequence-based architectures. We release our policies, evaluation setup, and example videos on the project page.
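
To make the "~15% per doubling" figure concrete, the following minimal sketch works through the arithmetic. It assumes, purely for illustration, that the reduction holds uniformly at every doubling, which would correspond to a power law in the number of locations; the abstract does not state this, and the function names are hypothetical rather than part of the released code.

```python
import math

# "~15%" reduction in navigation error per doubling of locations (from the abstract).
REDUCTION_PER_DOUBLING = 0.15


def relative_error(n_locations: float, n_baseline: float,
                   reduction: float = REDUCTION_PER_DOUBLING) -> float:
    """Error relative to the baseline, ASSUMING a constant reduction per doubling.

    This constant-rate assumption is ours, not a result reported in the paper.
    """
    if n_locations <= 0 or n_baseline <= 0:
        raise ValueError("location counts must be positive")
    doublings = math.log2(n_locations / n_baseline)
    return (1.0 - reduction) ** doublings


def scaling_exponent(reduction: float = REDUCTION_PER_DOUBLING) -> float:
    """Exponent b in error ~ N**b implied by a fixed reduction per doubling."""
    return math.log2(1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical location counts chosen only to show one, two, and three doublings.
    for n in (20, 40, 80):
        print(n, round(relative_error(n, n_baseline=10), 4))
    print("implied exponent:", round(scaling_exponent(), 4))
```

Under this assumption, two doublings would leave about 72% of the baseline error (0.85 × 0.85), and the implied power-law exponent is roughly -0.23. Whether the trend actually continues at that rate beyond the location counts studied is not something the abstract claims.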

Problem

Research questions and friction points this paper is trying to address.

navigation generalization

imitation learning

data diversity

visual navigation

unknown environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

data diversity

zero-shot navigation

imitation learning