Prompting Robot Teams with Natural Language

📅 2025-09-29

🤖 AI Summary

Problem: Specifying high-level tasks for multi-robot systems in natural language remains challenging because the behavior of an individual robot within a team is hard to specify and interpret, and each robot must continuously adapt to the actions of its peers. Method: We propose a language-driven decentralized coordination framework: (i) a large language model parses natural language instructions into deterministic finite automata (DFAs) that encode the task logic; (ii) a recurrent neural network (RNN) tracks the DFA state; and (iii) a graph neural network (GNN) computes decentralized control policies from local robot observations. Contribution/Results: This work aligns the pipeline end to end: natural language → semantic automaton → distributed reasoning. In both simulation and real-robot experiments, the lightweight single-model architecture executes complex temporal, collaborative tasks while remaining interpretable.
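
To make step (i) concrete, a sequential task such as "visit A, then B" can be written as a small DFA. The sketch below is illustrative only: the `DFA` class, the state names `q0` to `q2`, and the symbols `at_A`, `at_B`, and `none` are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    """Minimal deterministic finite automaton (hypothetical helper)."""
    transitions: dict      # (state, symbol) -> next state
    start: str
    accepting: frozenset

    def step(self, state, symbol):
        # Pairs not listed are treated as self-loops, so the transition
        # function is total: irrelevant events leave the task state unchanged.
        return self.transitions.get((state, symbol), state)

    def run(self, symbols):
        state = self.start
        for symbol in symbols:
            state = self.step(state, symbol)
        return state

    def accepts(self, symbols):
        return self.run(symbols) in self.accepting


# Assumed example task: "visit A, then B".
task = DFA(
    transitions={("q0", "at_A"): "q1", ("q1", "at_B"): "q2"},
    start="q0",
    accepting=frozenset({"q2"}),
)

print(task.accepts(["none", "at_A", "none", "at_B"]))  # A before B
print(task.accepts(["at_B", "at_A"]))                  # B before A
```

The first trace reaches the accepting state `q2`; the second stops in `q1` because `at_B` is ignored until `at_A` has been observed.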

📝 Abstract

This paper presents a framework towards prompting multi-robot teams with high-level tasks using natural language expressions. Our objective is to use the reasoning capabilities demonstrated by recent language models in understanding and decomposing human expressions of intent, and repurpose these for multi-robot collaboration and decision-making. The key challenge is that an individual's behavior in a collective can be hard to specify and interpret, and must continuously adapt to actions from others. This necessitates a framework that possesses the representational capacity required by the logic and semantics of a task, and yet supports decentralized and interactive real-time operation. We solve this dilemma by recognizing that a task can be represented as a deterministic finite automaton (DFA), and that recurrent neural networks (RNNs) can encode numerous automata. This allows us to distill the logic and sequential decompositions of sub-tasks obtained from a language model into an RNN, and align its internal states with the semantics of a given task. By training a graph neural network (GNN) control policy that is conditioned on the hidden states of the RNN and the language embeddings, our method enables robots to execute task-relevant actions in a decentralized manner. We present evaluations of this single lightweight interpretable model on various simulated and real-world multi-robot tasks that require sequential and collaborative behavior by the team. Project page: sites.google.com/view/prompting-teams.
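
The claim that a recurrent network can encode an automaton can be illustrated with a hand-built example. The sketch below uses a second-order recurrent update with one-hot hidden states, a standard construction that reproduces a DFA's transitions exactly. It is not the trained RNN from the paper; the integer state and symbol encodings and the function names (`build_tensor`, `rnn_step`, `track`) are assumptions made for this example.

```python
def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def build_tensor(num_states, num_symbols, transitions):
    """Weights T[k][i][j] = 1 if symbol j in state i leads to state k."""
    T = [[[0.0] * num_symbols for _ in range(num_states)]
         for _ in range(num_states)]
    for i in range(num_states):
        for j in range(num_symbols):
            k = transitions.get((i, j), i)  # unspecified pairs self-loop
            T[k][i][j] = 1.0
    return T


def rnn_step(T, h, x):
    """Second-order update: h'[k] = sum_ij T[k][i][j] * h[i] * x[j]."""
    return [
        sum(T[k][i][j] * h[i] * x[j]
            for i in range(len(h)) for j in range(len(x)))
        for k in range(len(T))
    ]


def track(T, symbols, num_states, num_symbols):
    """Run the recurrent update from state 0 and decode the final state."""
    h = one_hot(0, num_states)
    for s in symbols:
        h = rnn_step(T, h, one_hot(s, num_symbols))
    return h.index(max(h))


# Assumed task "visit A, then B": symbols 0 = at_A, 1 = at_B, 2 = none.
T = build_tensor(3, 3, {(0, 0): 1, (1, 1): 2})
print(track(T, [2, 0, 2, 1], 3, 3))  # final task state after A then B
```

Because the hidden state stays one-hot, it can be read directly as the current sub-task, which is the kind of alignment between internal state and task semantics that the abstract describes.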

Problem

Research questions and friction points this paper is trying to address.

Enabling natural language task specification for multi-robot teams

Decentralized execution of sequential collaborative robot behaviors

Translating language model reasoning into robot control policies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompting multi-robot teams with natural language

RNN encodes task logic from language model

GNN enables decentralized robot control
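
The decentralized control idea in the last item can be sketched as one round of message passing in which each robot uses only its neighbors' relative positions and its own task state. This is a minimal sketch, not the paper's architecture: the mean aggregation, the linear readout `W`, the communication `radius`, and all function names are assumptions made for this example.

```python
import math


def neighbors(positions, i, radius):
    """Indices of robots within communication radius of robot i."""
    xi, yi = positions[i]
    return [j for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xj - xi, yj - yi) <= radius]


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def local_policy(positions, i, task_state, num_states, W, radius):
    """One message-passing round for robot i using only local information."""
    xi, yi = positions[i]
    nbrs = neighbors(positions, i, radius)
    if nbrs:
        # Aggregate messages: mean relative position of the neighbors.
        mx = sum(positions[j][0] - xi for j in nbrs) / len(nbrs)
        my = sum(positions[j][1] - yi for j in nbrs) / len(nbrs)
    else:
        mx = my = 0.0  # isolated robot receives no messages
    # Condition the action on the current task state (one-hot encoded).
    features = [mx, my] + one_hot(task_state, num_states)
    return matvec(W, features)


# Assumed setup: three robots, two task states, hand-picked weights.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
W = [[1.0, 0.0, 0.5, -0.5],
     [0.0, 1.0, 0.0, 0.0]]
print(local_policy(positions, 0, 0, 2, W, radius=2.0))
print(local_policy(positions, 0, 1, 2, W, radius=2.0))
```

The same robot in the same position produces a different action when its task state changes, which shows how a policy conditioned on the task state can switch behavior between sub-tasks without a central coordinator.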
