EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval

📅 2025-10-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address three key challenges in edge-device zero-shot object-goal navigation (ObjNav): large-model deployment difficulty, complex map understanding, and high planning latency, this paper proposes EfficientNav, a lightweight framework for on-device LLM-based navigation. First, it introduces semantics-aware memory retrieval, which prunes redundant information from online-constructed navigation maps so that smaller LLMs can better understand the environment. Second, it designs discrete memory caching with attention-based memory clustering to efficiently save and reuse the KV cache, significantly accelerating planning. To the authors' knowledge, this is the first work to enable high-performance zero-shot ObjNav on local devices with a small-scale LLM (e.g., LLaMA3.2-11B) and no cloud dependency. On the HM3D benchmark, the method achieves an 11.1% absolute success-rate improvement over GPT-4-based baselines, a 6.7x real-time latency reduction, and a 4.7x end-to-end latency reduction, demonstrating superior accuracy, speed, and deployability.

📝 Abstract
Object-goal navigation (ObjNav) tasks an agent with navigating to the location of a specific object in an unseen environment. Embodied agents equipped with large language models (LLMs) and online-constructed navigation maps can perform ObjNav in a zero-shot manner. However, existing agents rely heavily on giant cloud-hosted LLMs, e.g., GPT-4, while directly switching to small LLMs, e.g., LLaMA3.2-11B, suffers significant success-rate drops due to limited model capacity for understanding complex navigation maps, which prevents deploying ObjNav on local devices. At the same time, the long prompt introduced by the navigation map description causes high planning latency on local devices. In this paper, we propose EfficientNav to enable efficient on-device LLM-based zero-shot ObjNav. To help smaller LLMs better understand the environment, we propose semantics-aware memory retrieval to prune redundant information in navigation maps. To reduce planning latency, we propose discrete memory caching and attention-based memory clustering to efficiently save and reuse the KV cache. Extensive experimental results demonstrate that EfficientNav achieves an 11.1% improvement in success rate on the HM3D benchmark over GPT-4-based baselines, and demonstrates a 6.7x real-time latency reduction and a 4.7x end-to-end latency reduction over a GPT-4 planner. Our code will be released soon.
Problem

Research questions and friction points this paper is trying to address.

Enabling on-device object navigation using efficient small language models
Reducing planning latency through navigation map caching and retrieval
Improving small models' environment understanding via semantics-aware memory pruning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantics-aware memory retrieval prunes redundant navigation map information
Discrete memory caching efficiently saves and reuses KV cache
Attention-based memory clustering reduces planning latency for on-device navigation
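The caching idea above can be illustrated with a toy sketch (hypothetical, not the authors' code): KV-cache entries are keyed by a discrete memory-group identifier, so a map description whose group was already prefilled is reused instead of recomputed. The `KVCacheStore` class, the group ids, and the stand-in "KV tensor" are all illustrative assumptions.

```python
# Hypothetical sketch of discrete memory caching: one KV-cache entry per
# navigation-map memory group, reused across planning calls. The stored
# "kv" value is a stand-in for real transformer prefill state.
from dataclasses import dataclass, field

@dataclass
class KVCacheStore:
    store: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def encode(self, group_id: str, tokens: list[str]) -> dict:
        """Return cached KV state for a memory group, prefilling on a miss."""
        if group_id in self.store:
            self.hits += 1          # group seen before: skip prefill entirely
        else:
            self.misses += 1        # first time: run (fake) prefill and save
            self.store[group_id] = {"kv": len(tokens), "tokens": tuple(tokens)}
        return self.store[group_id]

cache = KVCacheStore()
cache.encode("room_3", ["sofa", "tv", "table"])   # miss: prefill and save
cache.encode("room_7", ["bed", "lamp"])           # miss
cache.encode("room_3", ["sofa", "tv", "table"])   # hit: KV reused
```

In the paper's setting the payoff is that only newly observed map regions pay prefill cost, which is what drives the reported planning-latency reduction.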
Zebin Yang
Peking University
Efficient AI
Sunjian Zheng
Shenzhen Institute of Artificial Intelligence and Robotics for Society; School of Computer Science and Engineering, South China University of Technology
Tong Xie
Green Dynamics & University of New South Wales
Solar Cells · Large Language Models · Cheminformatics · Nano Materials
Tianshi Xu
Institute for Artificial Intelligence, Peking University; School of Integrated Circuits, Peking University
Bo Yu
Shenzhen Institute of Artificial Intelligence and Robotics for Society
Fan Wang
Shenzhen Institute of Artificial Intelligence and Robotics for Society
Jie Tang
UW Madison
Computed Tomography
Shaoshan Liu
PerceptIn
Embodied AI · Autonomous Machine Computing · Computer Systems · Technology Policy
Meng Li
Institute for Artificial Intelligence, Peking University; School of Integrated Circuits, Peking University; Beijing Advanced Innovation Center for Integrated Circuits