Advances in Embodied Navigation Using Large Language Models: A Survey

📅 2023-11-01
📈 Citations: 8
Influential citations: 1
🤖 AI Summary
This work addresses the lack of a systematic framework for leveraging large language models (LLMs) to enhance environmental perception and multi-step decision-making in embodied navigation. To this end, we establish the first unified research paradigm for LLM-driven embodied navigation, explicitly defining LLMs’ novel roles in cross-modal understanding, stepwise reasoning, and action planning. Our methodology integrates state-of-the-art GPT-family LLMs, vision-language models (VLMs), embodied simulation platforms (AI2-Thor and Habitat), and multimodal chain-of-thought prompting techniques, synthesizing over 100 recent studies. Key contributions include identifying critical research directions—namely, interpretability enhancement and integration with world models—and releasing Awesome-LLM-EN, an open-source, structured resource repository featuring standardized datasets, evaluation protocols, and reproducible baselines. This effort advances community-wide standardization and facilitates rigorous, comparable progress in LLM-augmented embodied intelligence.
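The summary describes an LLM-as-planner pattern for embodied navigation: at each step the agent assembles a chain-of-thought prompt from the instruction, current observation, and action history, and the LLM returns the next action. Below is a minimal, hypothetical sketch of that perceive-prompt-act loop; the LLM call is stubbed with a rule-based placeholder (`stub_llm`) so the loop runs standalone, whereas the surveyed systems would query a GPT-family model, often with a vision-language model captioning the observation first. All function and action names here are illustrative assumptions, not an API from the paper.

```python
# Hypothetical sketch of an LLM-driven navigation loop (not the paper's code).
# A real system would replace stub_llm with a call to a GPT-family model fed a
# multimodal chain-of-thought prompt built from VLM-captioned observations.

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

def build_prompt(instruction, observation, history):
    """Assemble a chain-of-thought style prompt from the current state."""
    steps = "\n".join(f"{i + 1}. {a}" for i, a in enumerate(history)) or "(none)"
    return (
        f"Instruction: {instruction}\n"
        f"Current observation: {observation}\n"
        f"Actions taken so far:\n{steps}\n"
        "Think step by step, then output one action from: " + ", ".join(ACTIONS)
    )

def stub_llm(prompt):
    """Placeholder for a real LLM call; picks an action from keywords."""
    if "goal visible ahead" in prompt:
        return "move_forward"
    if "wall ahead" in prompt:
        return "turn_left"
    return "stop"

def navigate(instruction, observations):
    """Run the perceive-prompt-act loop over a scripted observation stream."""
    history = []
    for obs in observations:
        action = stub_llm(build_prompt(instruction, obs, history))
        history.append(action)
        if action == "stop":
            break
    return history
```

In a simulator such as AI2-Thor or Habitat, the scripted observation list would be replaced by live egocentric frames, and the returned action would be executed in the environment before the next prompt is built.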
📝 Abstract
In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The application of LLMs with Embodied Intelligence has emerged as a significant area of focus. Among the myriad applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers an exhaustive summary of the symbiosis between LLMs and embodied intelligence with a focus on navigation. It reviews state-of-the-art models, research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field. A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN.
Problem

Research questions and friction points this paper is trying to address.

Enhancing embodied navigation with LLMs' perception and decision-making.
Surveying LLM-embodied intelligence synergy in navigation tasks.
Evaluating current models and datasets for embodied navigation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs enhance navigation with environmental perception.
Combines LLMs and embodied intelligence for decision-making.
Reviews models, methods, and datasets comprehensively.
Jinzhou Lin
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China
Han Gao
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China
Rongtao Xu
MBZUAI << CASIA << HUST
Intelligent Robot · Embodied AI · VLA · VLM · Spatiotemporal AI
Changwei Wang
Shandong Computer Science Center
Multimodal Learning · Embodied AI · Edge Intelligent Computing · AI for Healthcare · Safety Alignment
Li Guo
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China
Shibiao Xu
Beijing University of Posts and Telecommunications
Computer Vision · Machine Learning · Computer Graphics