ELA-ZSON: Efficient Layout-Aware Zero-Shot Object Navigation Agent with Hierarchical Planning

📅 2025-05-09

🤖 AI Summary

This work addresses zero-shot object navigation in complex, multi-room indoor environments. It proposes a hierarchical navigation framework that requires no task-specific training, human intervention, reward engineering, or model fine-tuning. The framework combines a layout-aware global topological map with a local scene-representation memory, and uses a large language model (LLM) agent for semantic reasoning and hierarchical control. Its key contribution is a “topology–memory–LLM” collaborative paradigm for zero-shot navigation that balances generalization capability and deployment efficiency. On the Matterport3D (MP3D) benchmark, the approach achieves an 85% success rate (SR) and 79% success weighted by path length (SPL), which the authors report as an improvement of over 40 percentage points in SR and 60% in SPL over existing methods. The approach is also validated with a virtual agent and in real-world robotic deployment.

📝 Abstract

We introduce ELA-ZSON, an efficient layout-aware zero-shot object navigation (ZSON) approach designed for complex multi-room indoor environments. ELA-ZSON plans hierarchically, combining a global topological map that carries layout information with a local imperative approach backed by a detailed scene-representation memory, and so achieves navigation that is both efficient and effective. The process is managed by an LLM-powered agent, which handles planning and navigation without the need for human interaction, complex rewards, or costly training. On the MP3D benchmark, our approach achieves an 85% object navigation success rate (SR) and 79% success weighted by path length (SPL), an improvement of over 40 percentage points in SR and 60% in SPL compared to existing methods. Furthermore, we validate the robustness of our approach through virtual-agent and real-world robotic deployment, showcasing its capability in practical scenarios. See https://anonymous.4open.science/r/ELA-ZSON-C67E/ for details.
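For readers unfamiliar with the two reported metrics, the sketch below computes SR and SPL using the standard definitions from the embodied-navigation literature: SR is the fraction of successful episodes, and SPL weights each success by the ratio of the shortest-path length to the length of the path the agent actually took. The `Episode` record and its field names are assumptions made for illustration, and the abstract does not say how the authors' evaluation code is organized.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    """One evaluation episode (hypothetical record, not from the paper)."""
    success: bool          # did the agent stop at the target object?
    shortest_path: float   # shortest start-to-goal path length, assumed > 0
    agent_path: float      # length of the path the agent actually travelled


def success_rate(episodes: Sequence[Episode]) -> float:
    """SR: fraction of episodes that ended in success."""
    return sum(e.success for e in episodes) / len(episodes)


def spl(episodes: Sequence[Episode]) -> float:
    """SPL: mean over episodes of S_i * l_i / max(p_i, l_i)."""
    total = 0.0
    for e in episodes:
        if e.success:
            # Failed episodes contribute 0; successes are discounted by detours.
            total += e.shortest_path / max(e.agent_path, e.shortest_path)
    return total / len(episodes)


episodes = [
    Episode(success=True, shortest_path=5.0, agent_path=5.0),   # optimal path
    Episode(success=True, shortest_path=5.0, agent_path=10.0),  # 2x detour
    Episode(success=False, shortest_path=4.0, agent_path=12.0), # failure
    Episode(success=True, shortest_path=3.0, agent_path=3.0),   # optimal path
]
print(success_rate(episodes), spl(episodes))
```

Because SPL can never exceed SR, a 79% SPL alongside an 85% SR indicates that the successful episodes followed paths close to the shortest ones.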

Problem

Research questions and friction points this paper is trying to address.

Efficient zero-shot object navigation in multi-room environments

Hierarchical planning using global and local layout information

LLM-powered agent for autonomous navigation without training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical planning with global and local maps

LLM-powered agent for autonomous navigation

Zero-shot object navigation without training
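The two-level idea in the points above can be illustrated with a toy sketch. This is not the authors' implementation: the room graph, the scene memory, the `pick_room` callback (a stand-in for the LLM's semantic reasoning), and all function names are hypothetical. The sketch first checks the local memory for the target object, and only otherwise asks the reasoner which room to search, then routes over the room-level topological map with breadth-first search.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

RoomGraph = Dict[str, List[str]]     # room -> adjacent rooms (hypothetical layout map)
SceneMemory = Dict[str, List[str]]   # room -> objects observed there so far


def shortest_room_path(graph: RoomGraph, start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search over the room-level topological map."""
    parents: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        if room == goal:
            # Walk the parent links back to the start and reverse them.
            path: List[str] = []
            node: Optional[str] = room
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(room, []):
            if nxt not in parents:
                parents[nxt] = room
                queue.append(nxt)
    return None  # goal room is unreachable from start


def plan(graph: RoomGraph, memory: SceneMemory, start: str, target: str,
         pick_room: Callable[[str, List[str]], str]) -> Optional[List[str]]:
    """Return a room-level route toward the target object."""
    # Memory lookup: if the target was already observed, head to that room.
    for room, objects in memory.items():
        if target in objects:
            return shortest_room_path(graph, start, room)
    # Otherwise defer to the reasoner (an LLM in the paper, a stub here).
    goal_room = pick_room(target, list(graph))
    return shortest_room_path(graph, start, goal_room)


graph = {
    "hallway": ["kitchen", "bedroom"],
    "kitchen": ["hallway"],
    "bedroom": ["hallway", "bathroom"],
    "bathroom": ["bedroom"],
}
memory = {"bedroom": ["bed"]}


def stub_reasoner(target: str, rooms: List[str]) -> str:
    # Hard-coded prior standing in for an LLM query.
    return "bathroom" if target == "toilet" else rooms[0]


print(plan(graph, memory, "kitchen", "toilet", stub_reasoner))
print(plan(graph, memory, "kitchen", "bed", stub_reasoner))
```

Keeping the reasoner behind a plain callback makes the routing logic testable without any model access. A real system would also need low-level motion control inside each room, which this sketch omits.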
