🤖 AI Summary
Current social robot navigation evaluation relies heavily on hand-crafted rules and lacks quantifiable, standardized benchmarks. To address this, we propose the first data-driven metric for assessing social navigation quality. Our method builds a high-quality dataset of 4,402 quality-assured navigation trajectories (drawn from 182 real-world and 4,245 simulated recordings), each annotated with multi-round human perceptual ratings (e.g., comfort, naturalness). We then train a supervised RNN-based evaluator using these human scores as ground-truth labels. Our key contributions are: (1) releasing the first empirically grounded, fine-grained human-annotated dataset for social navigation evaluation; (2) establishing a generalizable and interpretable data-driven evaluation paradigm that facilitates navigation policy optimization and fair cross-method comparison; and (3) open-sourcing all data, code, and trained model weights to support community benchmarking and reproducibility.
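The core idea of the evaluator can be sketched as a recurrent network that consumes a trajectory, one timestep at a time, and emits a scalar score to be regressed against the human rating. The following is a minimal illustrative sketch, not the paper's actual architecture: the feature layout, sizes, and the plain Elman-style recurrence are all assumptions.

```python
import numpy as np

# Hypothetical per-timestep features (an assumption, not the paper's schema):
# e.g. [robot_x, robot_y, robot_speed, distance_to_nearest_person].
# The target is a scalar human rating squashed to (0, 1).

def rnn_score(trajectory, Wxh, Whh, Why, bh, by):
    """Elman-style RNN forward pass: feature sequence -> scalar score."""
    h = np.zeros(Whh.shape[0])
    for x in trajectory:                     # iterate over timesteps
        h = np.tanh(Wxh @ x + Whh @ h + bh)  # recurrent state update
    logit = Why @ h + by                     # read out a single score
    return float(1.0 / (1.0 + np.exp(-logit)))  # sigmoid -> (0, 1)

rng = np.random.default_rng(0)
n_features, n_hidden = 4, 8
Wxh = rng.normal(size=(n_hidden, n_features)) * 0.1
Whh = rng.normal(size=(n_hidden, n_hidden)) * 0.1
Why = rng.normal(size=n_hidden) * 0.1
bh, by = np.zeros(n_hidden), 0.0

trajectory = rng.normal(size=(20, n_features))  # one 20-step trajectory
score = rnn_score(trajectory, Wxh, Whh, Why, bh, by)
assert 0.0 < score < 1.0
```

In training, the weights would be fit by minimizing a regression loss (e.g. mean squared error) between the predicted score and the aggregated human rating for each trajectory.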
📝 Abstract
This paper presents a joint effort towards the development of a data-driven Social Robot Navigation metric to facilitate benchmarking and policy optimization. We motivate our approach and describe our proposal for storing rated social navigation trajectory datasets. Following these guidelines, we compiled a dataset of 4427 trajectories -- 182 real and 4245 simulated -- and presented it to human raters, yielding a total of 4402 rated trajectories after data quality assurance. We also trained an RNN-based baseline metric on the dataset and present quantitative and qualitative results. All data, software, and model weights are publicly available.
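The proposal for storing rated trajectories implies a record layout pairing per-timestep state with rater scores, plus a quality-assurance pass that discards incomplete or out-of-range entries (consistent with 4427 collected vs. 4402 retained). The sketch below is purely illustrative: every field name, the `"real"`/`"simulated"` tag, and the 1-5 rating scale are assumptions, not the paper's actual schema.

```python
# Hypothetical rated-trajectory record and QA filter; all names and the
# 1-5 rating scale are illustrative assumptions, not the released format.

def validate_record(record):
    """Basic quality-assurance check: required fields present, ratings in range."""
    required = {"trajectory_id", "source", "timesteps", "ratings"}
    if not required <= record.keys():
        return False
    if record["source"] not in ("real", "simulated"):
        return False
    if not record["timesteps"] or not record["ratings"]:
        return False
    # keep only records whose every rating lies on the assumed 1-5 scale
    return all(1 <= r <= 5 for r in record["ratings"])

example = {
    "trajectory_id": "sim_000017",        # hypothetical identifier
    "source": "simulated",                # or "real"
    "timesteps": [                        # per-step robot and pedestrian state
        {"t": 0.0, "robot_xy": (0.00, 0.0), "people_xy": [(1.50, 0.3)]},
        {"t": 0.1, "robot_xy": (0.05, 0.0), "people_xy": [(1.45, 0.3)]},
    ],
    "ratings": [4, 5, 4],                 # perceptual scores from several raters
}
assert validate_record(example)
```

A filter like this would map the 4427 collected trajectories down to the 4402 that survive quality assurance, though the actual acceptance criteria are defined by the paper's guidelines.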