Improving Cooperation in Collaborative Embodied AI

📅 2025-10-03
🤖 AI Summary
This study addresses key challenges in collaborative embodied AI (low multi-agent cooperation efficiency, insufficient LLM-driven communication and reasoning, and poor naturalness of human-machine interaction) by proposing an enhanced CoELA framework. Methodologically, it introduces a hierarchical prompt engineering strategy tailored for task coordination, integrating role-aware agent instructions, shared memory mechanisms, and dynamic context management; it also integrates speech interaction (ASR/TTS) to enable voice-driven collaborative decision-making. Experiments with open-source LLMs (e.g., Gemma3) show a 22% improvement in system efficiency over the original CoELA, alongside a more engaging user interface for iterative system development and demonstrations. Core contributions include: (1) a scalable, LLM-based multi-agent prompting paradigm; (2) a speech-augmented embodied collaboration architecture; and (3) lightweight deployment validation in realistic scenarios.

📝 Abstract
The integration of Large Language Models (LLMs) into multiagent systems has opened new possibilities for collaborative reasoning and cooperation with AI agents. This paper explores different prompting methods and evaluates their effectiveness in enhancing agent collaborative behaviour and decision-making. We enhance CoELA, a framework designed for building Collaborative Embodied Agents that leverage LLMs for multi-agent communication, reasoning, and task coordination in shared virtual spaces. Through systematic experimentation, we examine different LLMs and prompt engineering strategies to identify optimised combinations that maximise collaboration performance. Furthermore, we extend our research by integrating speech capabilities, enabling seamless collaborative voice-based interactions. Our findings highlight the effectiveness of prompt optimisation in enhancing collaborative agent performance; for example, our best combination improved the efficiency of the system running with Gemma3 by 22% compared to the original CoELA system. In addition, the speech integration provides a more engaging user interface for iterative system development and demonstrations.
Problem

Research questions and friction points this paper is trying to address.

Enhancing collaborative behavior in multi-agent AI systems through prompting
Optimizing LLM integration for embodied agent communication and coordination
Improving task efficiency in shared virtual environments using speech capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhanced CoELA framework for collaborative embodied agents
Optimized prompting methods to improve multi-agent cooperation
Integrated speech capabilities for voice-based agent interactions
Hima Jacob Leven Suprabha
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Laxmi Nag Laxminarayan Nagesh
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Ajith Nair
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Alvin Reuben Amal Selvaster
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Ayan Khan
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Raghuram Damarla
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Sanju Hannah Samuel
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Sreenithi Saravana Perumal
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Titouan Puech
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Venkataramireddy Marella
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Vishal Sonar
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
Alessandro Suglia
Assistant Professor, Heriot-Watt University, Edinburgh Centre for Robotics, National Robotarium
Multimodal Generative AI, Embodied AI, Conversational AI
Oliver Lemon
Professor; Academic Lead National Robotarium; Director of Interaction Lab, Edinburgh
Conversational AI, Spoken Language Understanding, Spoken Dialog Systems, Dialog Systems, Human-Robot