Advances in Embodied Navigation Using Large Language Models: A Survey

📅 2023-11-01
📈 Citations: 8
Influential citations: 1
🤖 AI Summary
This work addresses the lack of a systematic framework for leveraging large language models (LLMs) to enhance environmental perception and multi-step decision-making in embodied navigation. To this end, we establish the first unified research paradigm for LLM-driven embodied navigation, explicitly defining LLMs’ novel roles in cross-modal understanding, stepwise reasoning, and action planning. Our methodology integrates state-of-the-art GPT-family LLMs, vision-language models (VLMs), embodied simulation platforms (AI2-Thor and Habitat), and multimodal chain-of-thought prompting techniques, synthesizing over 100 recent studies. Key contributions include identifying critical research directions—namely, interpretability enhancement and integration with world models—and releasing Awesome-LLM-EN, an open-source, structured resource repository featuring standardized datasets, evaluation protocols, and reproducible baselines. This effort advances community-wide standardization and facilitates rigorous, comparable progress in LLM-augmented embodied intelligence.
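The summary describes an LLM-as-planner pattern for embodied navigation: at each step the agent assembles a chain-of-thought prompt from the instruction, current observation, and action history, and the LLM returns the next action. Below is a minimal, hypothetical sketch of that perceive-prompt-act loop; the LLM call is stubbed with a rule-based placeholder (`stub_llm`) so the loop runs standalone, whereas the surveyed systems would query a GPT-family model, often with a vision-language model captioning the observation first. All function and action names here are illustrative assumptions, not an API from the paper.

```python
# Hypothetical sketch of an LLM-driven navigation loop (not the paper's code).
# A real system would replace stub_llm with a call to a GPT-family model fed a
# multimodal chain-of-thought prompt built from VLM-captioned observations.

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

def build_prompt(instruction, observation, history):
    """Assemble a chain-of-thought style prompt from the current state."""
    steps = "\n".join(f"{i + 1}. {a}" for i, a in enumerate(history)) or "(none)"
    return (
        f"Instruction: {instruction}\n"
        f"Current observation: {observation}\n"
        f"Actions taken so far:\n{steps}\n"
        "Think step by step, then output one action from: " + ", ".join(ACTIONS)
    )

def stub_llm(prompt):
    """Placeholder for a real LLM call; picks an action from keywords."""
    if "goal visible ahead" in prompt:
        return "move_forward"
    if "wall ahead" in prompt:
        return "turn_left"
    return "stop"

def navigate(instruction, observations):
    """Run the perceive-prompt-act loop over a scripted observation stream."""
    history = []
    for obs in observations:
        action = stub_llm(build_prompt(instruction, obs, history))
        history.append(action)
        if action == "stop":
            break
    return history
```

In a simulator such as AI2-Thor or Habitat, the scripted observation list would be replaced by live egocentric frames, and the returned action would be executed in the environment before the next prompt is built.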
📝 Abstract
In recent years, the rapid advancement of Large Language Models (LLMs) such as the Generative Pre-trained Transformer (GPT) has attracted increasing attention due to their potential in a variety of practical applications. The application of LLMs with Embodied Intelligence has emerged as a significant area of focus. Among the myriad applications of LLMs, navigation tasks are particularly noteworthy because they demand a deep understanding of the environment and quick, accurate decision-making. LLMs can augment embodied intelligence systems with sophisticated environmental perception and decision-making support, leveraging their robust language and image-processing capabilities. This article offers an exhaustive summary of the symbiosis between LLMs and embodied intelligence with a focus on navigation. It reviews state-of-the-art models, research methodologies, and assesses the advantages and disadvantages of existing embodied navigation models and datasets. Finally, the article elucidates the role of LLMs in embodied intelligence, based on current research, and forecasts future directions in the field. A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN.
Problem

Research questions and friction points this paper is trying to address.

Enhancing embodied navigation with LLMs' perception and decision-making.
Surveying LLM-embodied intelligence synergy in navigation tasks.
Evaluating current models and datasets for embodied navigation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs enhance navigation with environmental perception.
Combines LLMs and embodied intelligence for decision-making.
Reviews models, methods, and datasets comprehensively.
Jinzhou Lin
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China
Han Gao
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China
Rongtao Xu
MBZUAI << CASIA << HUST
Intelligent Robot · Embodied AI · VLA · VLM · Spatiotemporal AI
Changwei Wang
Shandong Computer Science Center
Multimodal Learning · Embodied AI · Edge Intelligent Computing · AI for Healthcare · Safety Alignment
Li Guo
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, China
Shibiao Xu
Beijing University of Posts and Telecommunications
Computer Vision · Machine Learning · Computer Graphics