🤖 AI Summary
Existing visual navigation policies generalize poorly because they depend strongly on a specific robot embodiment (e.g., body size, field of view), which hinders deployment across increasingly diverse custom hardware. To address this, we propose RING, the first embodiment-agnostic indoor navigation policy. RING leverages a configurable multi-embodiment simulation environment built on AI2-THOR that dynamically varies body dimensions, camera pose, and rotation pivot point. It combines mixed training across many embodiments with normalized visual observation encoding to enable zero-shot cross-embodiment transfer. Experiments demonstrate that RING achieves an average success rate of 72.1% across five simulated embodiments and 78.9% on four heterogeneous real-world robots (Stretch, LoCoBot, Go1, and TurtleBot), significantly outperforming embodiment-specific baselines. This is the first empirical validation of embodiment-agnostic navigation on real heterogeneous robotic platforms.
📝 Abstract
Modern robots vary significantly in shape, size, and the sensor configurations they use to perceive and interact with their environments. However, most navigation policies are embodiment-specific: a policy learned on one robot's configuration typically does not generalize gracefully to another, and even small changes in body size or camera viewpoint can cause failures. With the recent surge in custom hardware development, it is desirable to learn a single policy that transfers across embodiments, eliminating the need to (re)train for each specific robot. In this paper, we introduce RING (Robotic Indoor Navigation Generalist), an embodiment-agnostic policy trained solely in simulation with diverse, randomly initialized embodiments at scale. Specifically, we augment the AI2-THOR simulator with the ability to instantiate robot embodiments with controllable configurations, varying body size, rotation pivot point, and camera configuration. On the visual object-goal navigation task, RING achieves robust performance on unseen real robot platforms (Stretch RE-1, LoCoBot, Unitree's Go1), reaching average success rates of 72.1% across 5 embodiments in simulation and 78.9% across 4 robot platforms in the real world. (project website: https://one-ring-policy.allen.ai/)
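To make the embodiment-randomization idea concrete, the sketch below samples a random embodiment configuration per training episode, as the abstract describes. All parameter names, ranges, and the `EmbodimentConfig` structure are illustrative assumptions for this sketch; they are not the paper's actual values or the AI2-THOR API.

```python
import random
from dataclasses import dataclass

@dataclass
class EmbodimentConfig:
    """Hypothetical per-episode embodiment parameters (all units assumed)."""
    body_width: float    # collider width in meters
    body_depth: float    # collider depth in meters
    body_height: float   # collider height in meters
    pivot_offset: float  # rotation pivot offset from body center, meters
    camera_height: float # camera mount height, meters
    camera_pitch: float  # downward camera tilt, degrees
    camera_fov: float    # horizontal field of view, degrees

def sample_embodiment(rng: random.Random) -> EmbodimentConfig:
    """Draw one random embodiment; ranges are placeholder assumptions."""
    return EmbodimentConfig(
        body_width=rng.uniform(0.2, 0.6),
        body_depth=rng.uniform(0.2, 0.6),
        body_height=rng.uniform(0.3, 1.5),
        pivot_offset=rng.uniform(-0.15, 0.15),
        camera_height=rng.uniform(0.3, 1.5),
        camera_pitch=rng.uniform(0.0, 30.0),
        camera_fov=rng.uniform(60.0, 100.0),
    )

if __name__ == "__main__":
    rng = random.Random(0)
    # A new embodiment would be sampled at the start of each training episode.
    print(sample_embodiment(rng))
```

In a training loop, each sampled configuration would be passed to the simulator when the agent is spawned, so the policy never sees a fixed body or camera setup.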