🤖 AI Summary
Task-oriented navigation in unstructured, unknown environments requires real-time construction of metric-semantic maps that jointly encode rich semantics and geometric precision to enable cross-task generalization.
Method: We propose a hierarchical mapping framework built on language-embedded Gaussian splatting. Attaching natural language embeddings to the Gaussian splatting representation lets a single map serve semantic understanding, sparse high-level planning, and dense geometric collision avoidance. The map is updated incrementally, driven by active perception, and task reasoning is coupled with motion planning in a closed loop.
Results: The system was evaluated on real robots in cluttered indoor spaces and kilometer-scale outdoor scenes. It supports task clarification and re-specification while operating in real time, and it achieves a competitive ratio of about 60% against privileged baselines.
📝 Abstract
We address the challenge of task-oriented navigation in unstructured and unknown environments, where robots must incrementally build and reason over rich metric-semantic maps in real time. Because tasks may require clarification or re-specification, the map must be rich enough to generalize across a wide range of tasks. To execute tasks specified in natural language, we propose a hierarchical representation built on language-embedded Gaussian splatting. It provides both a sparse semantic layer for planning that lends itself to online operation and a dense geometric layer for collision-free navigation. We validate the method through real-world robot experiments in both cluttered indoor and kilometer-scale outdoor environments, achieving a competitive ratio of about 60% against privileged baselines. Experiment videos and more details can be found on our project page: https://atlasnav.github.io
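To make the idea of a language-embedded map concrete, the following is a minimal sketch and not the authors' implementation. It assumes each Gaussian stores a centre and a language feature vector, and that a natural-language task has already been encoded into the same embedding space by some text encoder. The names `LanguageGaussian`, `cosine_similarity`, and `query_map`, as well as the 3-dimensional toy embeddings, are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LanguageGaussian:
    """Hypothetical map element: a 3D Gaussian centre plus a language embedding."""
    mean: Sequence[float]       # centre in the map frame (metres)
    embedding: Sequence[float]  # language feature; toy dimension in this sketch


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_map(gaussians: List[LanguageGaussian],
              text_embedding: Sequence[float],
              top_k: int = 1) -> List[LanguageGaussian]:
    """Return the top_k Gaussians whose embeddings best match the query."""
    ranked = sorted(
        gaussians,
        key=lambda g: cosine_similarity(g.embedding, text_embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy map with three Gaussians and made-up embeddings (assumption).
toy_map = [
    LanguageGaussian(mean=(1.0, 0.0, 0.5), embedding=(0.9, 0.1, 0.0)),
    LanguageGaussian(mean=(4.0, 2.0, 0.5), embedding=(0.0, 1.0, 0.1)),
    LanguageGaussian(mean=(7.0, -1.0, 0.5), embedding=(0.1, 0.0, 1.0)),
]

# Stand-in for the encoded task description (assumption).
query = (0.0, 0.9, 0.2)
goal = query_map(toy_map, query)[0]
print(goal.mean)  # (4.0, 2.0, 0.5)
```

In a sketch like this, the best-matching centre would serve as a sparse semantic goal for a high-level planner, while the dense Gaussian geometry would be consulted separately for collision checking. Because the embeddings stay in the map, a re-specified task only requires a new query rather than a new map.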