ELA-ZSON: Efficient Layout-Aware Zero-Shot Object Navigation Agent with Hierarchical Planning

📅 2025-05-09
🤖 AI Summary
This work addresses zero-shot object navigation in complex, multi-room indoor environments. We propose a hierarchical navigation framework that requires no task-specific training, human intervention, explicit reward engineering, or model fine-tuning. Methodologically, the framework integrates a layout-aware global topological map with a local scene memory representation, using a large language model (LLM) as the core for semantic reasoning and hierarchical control. Our key contribution is a "topology–memory–LLM" collaborative paradigm for zero-shot navigation that balances generalization capability and deployment efficiency. On the Matterport3D (MP3D) benchmark, our approach achieves an 85% success rate (SR) and 79% success weighted by path length (SPL), surpassing prior methods by over 40 percentage points in SR and 60% in SPL. The approach is validated with both a virtual agent and a real-world robotic platform.

๐Ÿ“ Abstract
We introduce ELA-ZSON, an efficient layout-aware zero-shot object navigation (ZSON) approach designed for complex multi-room indoor environments. By planning hierarchically leveraging a global topologigal map with layout information and local imperative approach with detailed scene representation memory, ELA-ZSON achieves both efficient and effective navigation. The process is managed by an LLM-powered agent, ensuring seamless effective planning and navigation, without the need for human interaction, complex rewards, or costly training. Our experimental results on the MP3D benchmark achieves 85% object navigation success rate (SR) and 79% success rate weighted by path length (SPL) (over 40% point improvement in SR and 60% improvement in SPL compared to exsisting methods). Furthermore, we validate the robustness of our approach through virtual agent and real-world robotic deployment, showcasing its capability in practical scenarios. See https://anonymous.4open.science/r/ELA-ZSON-C67E/ for details.
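
For readers unfamiliar with the two reported metrics, the sketch below computes SR and SPL using the standard definition of SPL from the embodied-navigation literature: each successful episode contributes the ratio of the shortest-path length to the length of the path the agent actually took. The `Episode` fields and the `sr_and_spl` function name are illustrative assumptions and are not taken from the ELA-ZSON code.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Episode:
    """One navigation episode (hypothetical record layout)."""
    success: bool          # did the agent stop at the target object?
    shortest_path: float   # shortest start-to-goal path length, in meters
    agent_path: float      # length of the path the agent actually took, in meters


def sr_and_spl(episodes: Sequence[Episode]) -> Tuple[float, float]:
    """Return (SR, SPL) averaged over all episodes."""
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    sr = sum(1 for e in episodes if e.success) / n
    spl_sum = 0.0
    for e in episodes:
        if not e.success:
            continue  # failed episodes contribute 0 to SPL
        denom = max(e.agent_path, e.shortest_path)
        # Degenerate case: agent already at the goal, count as a perfect path.
        spl_sum += 1.0 if denom == 0 else e.shortest_path / denom
    return sr, spl_sum / n
```

With three made-up episodes, one optimal success, one success along a path twice the shortest length, and one failure, this gives SR = 2/3 and SPL = 0.5, which shows how SPL penalizes detours that SR ignores.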
Problem

Research questions and friction points this paper is trying to address.

Efficient zero-shot object navigation in multi-room environments
Hierarchical planning using global and local layout information
LLM-powered agent for autonomous navigation without training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical planning with global and local maps
LLM-powered agent for autonomous navigation
Zero-shot object navigation without training
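
The two-level idea in the lists above can be illustrated with a toy sketch. This is not the authors' implementation: the room graph `LAYOUT`, the per-room object memory `SCENE_MEMORY`, and the function names are all hypothetical, and the LLM is replaced by a hand-written lookup (`stub_room_ranker`) so that the example runs offline. It only shows the general pattern of ranking rooms semantically, routing between rooms over a topological graph, and then checking a local memory for the target.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

# Hypothetical global layout: rooms are nodes, doorways are edges.
LAYOUT: Dict[str, List[str]] = {
    "hallway": ["living_room", "kitchen", "bedroom"],
    "living_room": ["hallway"],
    "kitchen": ["hallway"],
    "bedroom": ["hallway", "bathroom"],
    "bathroom": ["bedroom"],
}

# Hypothetical local scene memory: objects observed in each room.
SCENE_MEMORY: Dict[str, List[str]] = {
    "living_room": ["sofa", "tv"],
    "kitchen": ["fridge", "sink"],
    "bathroom": ["toilet", "sink"],
}


def room_route(layout: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest room sequence from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in layout.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def stub_room_ranker(target: str, rooms: List[str]) -> List[str]:
    """Stand-in for the LLM: order rooms using a hand-written prior."""
    prior = {"toilet": ["bathroom"], "fridge": ["kitchen"], "sofa": ["living_room"]}
    preferred = prior.get(target, [])
    # Preferred rooms first, then the rest alphabetically.
    return sorted(rooms, key=lambda r: (r not in preferred, r))


def navigate(
    target: str,
    start: str,
    rank_rooms: Callable[[str, List[str]], List[str]] = stub_room_ranker,
) -> Optional[List[str]]:
    """Return the rooms traversed until the target is found, or None."""
    current = start
    traversed = [start]
    for room in rank_rooms(target, list(LAYOUT)):
        route = room_route(LAYOUT, current, room)  # global, room-level plan
        if route is None:
            continue
        traversed.extend(route[1:])
        current = room
        if target in SCENE_MEMORY.get(room, []):   # local, object-level check
            return traversed
    return None
```

In this toy setup, `navigate("toilet", "hallway")` ranks the bathroom first and routes through the bedroom to reach it. Swapping `stub_room_ranker` for a call to an actual LLM would be the natural extension, but the prompt format and interface for that are not specified in the text above.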