🤖 AI Summary
This paper addresses the ambiguous functional positioning and fragmented application paradigms of language models in robotic systems. We propose the first comprehensive framework that systematically classifies the functional roles of language within robotic control flows: human-to-robot instruction input, robot-to-human state feedback, an inter-robot collaboration medium, and an internal planning tool. Methodologically, our approach integrates large language models with multimodal perception, embodied reasoning, and robot control interfaces to enable cross-modal interaction spanning text, speech, vision, and action generation. Key contributions include: (1) the first formal characterization of language’s structural role in human–robot–robot multi-party communication; (2) identification of four critical technical gaps: scalability, real-time responsiveness, causal understanding, and safety alignment; and (3) a systematic roadmap for advancing language-augmented embodied intelligence.
📝 Abstract
Embodied robots that can interact with their environment and neighbours are increasingly being used as a test case for developing Artificial Intelligence. This creates a need for multimodal robot controllers that can operate across different types of information, including text. Large Language Models are able to process and generate textual as well as audiovisual data and, more recently, robot actions. Language models are increasingly being applied to robotic systems; these language-based robots leverage the power of language models in a variety of ways. Additionally, the use of language opens up multiple forms of information exchange between members of a human-robot team. This survey motivates the use of language models in robotics, and then delineates works based on the part of the overall control flow in which language is incorporated. Language can be used by a human to task a robot, by a robot to inform a human, between robots as a human-like communication medium, and internally for a robot's planning and control. Applications of language-based robots are explored, and numerous limitations and challenges are discussed to summarise the development needed for the future of language-based robotics.