Data Scaling for Navigation in Unknown Environments

📅 2026-01-14
🤖 AI Summary
This study addresses the poor zero-shot generalization of end-to-end visual navigation policies in unseen environments through a large-scale empirical analysis of how the quantity and geographic diversity of training data affect map-free point-goal navigation policies. Using a curated crowdsourced video dataset spanning 161 locations across 35 countries (4,565 hours in total), the authors train navigation policies and evaluate their closed-loop control performance on sidewalk robots in four countries, covering 125 km of autonomous driving. The study, which the authors describe as the first at this scale, finds that geographic diversity matters far more than total data volume: doubling the number of distinct locations reduces navigation error by approximately 15%, while the benefit of adding more data from existing locations saturates quickly. With noisy crowdsourced data, simple regression-based models outperform generative and sequence-based architectures, and large-scale training data enables zero-shot performance approaching that of policies trained with environment-specific demonstrations.

📝 Abstract
Generalization of imitation-learned navigation policies to environments unseen in training remains a major challenge. We address this by conducting the first large-scale study of how data quantity and data diversity affect real-world generalization in end-to-end, map-free visual navigation. Using a curated 4,565-hour crowd-sourced dataset collected across 161 locations in 35 countries, we train policies for point-goal navigation and evaluate their closed-loop control performance on sidewalk robots operating in four countries, covering 125 km of autonomous driving. Our results show that large-scale training data enables zero-shot navigation in unknown environments, approaching the performance of policies trained with environment-specific demonstrations. Critically, we find that data diversity is far more important than data quantity. Doubling the number of geographical locations in a training set decreases navigation errors by ~15%, while the performance benefit from adding data from existing locations saturates with very little data. We also observe that, with noisy crowd-sourced data, simple regression-based models outperform generative and sequence-based architectures. We release our policies, evaluation setup and example videos on the project page.
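To make the "~15% per doubling" figure concrete, the sketch below treats it as a constant multiplicative reduction, which corresponds to a power law in the number of training locations. This is an illustrative assumption: the abstract reports only the per-doubling figure, and nothing here implies the trend holds outside the range the authors studied. The function names and the baseline values in the usage lines are hypothetical.

```python
import math


def projected_error(base_error, base_locations, locations,
                    reduction_per_doubling=0.15):
    """Extrapolate navigation error under an assumed constant
    per-doubling reduction (hypothetical helper, not the authors' model).

    Each doubling of the number of locations multiplies the error by
    (1 - reduction_per_doubling).
    """
    if base_error < 0 or base_locations <= 0 or locations <= 0:
        raise ValueError("error must be >= 0 and location counts > 0")
    doublings = math.log2(locations / base_locations)
    return base_error * (1.0 - reduction_per_doubling) ** doublings


def power_law_exponent(reduction_per_doubling=0.15):
    """Exponent b in error ~ locations**b implied by the per-doubling reduction."""
    return math.log2(1.0 - reduction_per_doubling)


# Usage with a made-up baseline of error 1.0 at 10 locations.
one_doubling = projected_error(1.0, 10, 20)
two_doublings = projected_error(1.0, 10, 40)
exponent = power_law_exponent()
```

Under this assumption the error scales roughly as the number of locations raised to the power of about -0.23, so gains from diversity diminish steadily but do not vanish.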
Problem

Research questions and friction points this paper is trying to address.

navigation generalization
imitation learning
data diversity
visual navigation
unknown environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

data diversity
zero-shot navigation
imitation learning
visual navigation
large-scale dataset
Lauri Suomela
Tampere University
Naoki Takahata
Tohoku University
Sasanka Kuruppu Arachchige
Tampere University
Harry Edelman
Turku University of Applied Sciences
Joni-Kristian Kämäräinen
Professor of Signal Processing, Tampere University
Computer Vision, Robot Vision, Robot Learning, Robotics