🤖 AI Summary
Existing visual navigation policies generalize poorly because they depend strongly on a specific robot embodiment (e.g., body size, field of view), which hinders deployment across increasingly diverse custom hardware. To address this, we propose RING, the first embodiment-agnostic indoor navigation policy. RING leverages a configurable multi-embodiment simulation environment built on AI2-THOR that dynamically varies body dimensions, camera pose, and rotation pivot point. It combines mixed training across many embodiments with normalized visual observation encoding to enable zero-shot cross-embodiment transfer. Experiments demonstrate that RING achieves an average success rate of 72.1% across five simulated embodiments and 78.9% on four heterogeneous real-world robots (Stretch, LoCoBot, Go1, and TurtleBot), significantly outperforming embodiment-specific baselines. This is the first empirical validation of embodiment-agnostic navigation on real heterogeneous robotic platforms.
📝 Abstract
Modern robots vary significantly in shape, size, and the sensor configurations they use to perceive and interact with their environments. However, most navigation policies are embodiment-specific: a policy learned on one robot's configuration typically does not generalize gracefully to another, and even small changes in body size or camera viewpoint can cause failures. With the recent surge in custom hardware development, it is desirable to learn a single policy that transfers across embodiments, eliminating the need to (re)train for each specific robot. In this paper, we introduce RING (Robotic Indoor Navigation Generalist), an embodiment-agnostic policy trained solely in simulation with diverse, randomly initialized embodiments at scale. Specifically, we augment the AI2-THOR simulator with the ability to instantiate robot embodiments with controllable configurations, varying body size, rotation pivot point, and camera configuration. On the visual object-goal navigation task, RING achieves robust performance on unseen real robot platforms (Stretch RE-1, LoCoBot, Unitree's Go1), reaching average success rates of 72.1% across 5 embodiments in simulation and 78.9% across 4 robot platforms in the real world. (project website: https://one-ring-policy.allen.ai/)
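To make the embodiment-randomization idea concrete, the sketch below samples a random embodiment configuration per training episode, as the abstract describes. All parameter names, ranges, and the `EmbodimentConfig` structure are illustrative assumptions for this sketch; they are not the paper's actual values or the AI2-THOR API.

```python
import random
from dataclasses import dataclass

@dataclass
class EmbodimentConfig:
    """Hypothetical per-episode embodiment parameters (all units assumed)."""
    body_width: float    # collider width in meters
    body_depth: float    # collider depth in meters
    body_height: float   # collider height in meters
    pivot_offset: float  # rotation pivot offset from body center, meters
    camera_height: float # camera mount height, meters
    camera_pitch: float  # downward camera tilt, degrees
    camera_fov: float    # horizontal field of view, degrees

def sample_embodiment(rng: random.Random) -> EmbodimentConfig:
    """Draw one random embodiment; ranges are placeholder assumptions."""
    return EmbodimentConfig(
        body_width=rng.uniform(0.2, 0.6),
        body_depth=rng.uniform(0.2, 0.6),
        body_height=rng.uniform(0.3, 1.5),
        pivot_offset=rng.uniform(-0.15, 0.15),
        camera_height=rng.uniform(0.3, 1.5),
        camera_pitch=rng.uniform(0.0, 30.0),
        camera_fov=rng.uniform(60.0, 100.0),
    )

if __name__ == "__main__":
    rng = random.Random(0)
    # A new embodiment would be sampled at the start of each training episode.
    print(sample_embodiment(rng))
```

In a training loop, each sampled configuration would be passed to the simulator when the agent is spawned, so the policy never sees a fixed body or camera setup.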