EmCoop: A Framework and Benchmark for Embodied Cooperation Among LLM Agents

📅 2026-02-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of benchmarks that support fine-grained analysis of collaboration in embodied multi-agent systems, a gap that hinders understanding of how collaboration emerges, evolves, and influences task success. We propose the first embodied collaboration benchmark that supports arbitrary numbers of agents and flexible communication topologies. Our approach introduces a two-layer architecture that decouples cognitive reasoning from embodied interaction and uses large language models for natural-language communication and reasoning. We run systematic experiments in two scalable environments. By introducing generalizable, process-level metrics for collaboration quality that go beyond conventional task success rates, we enable fine-grained and reproducible analysis of collaborative dynamics and failure modes across team sizes and task configurations.

📝 Abstract

Real-world scenarios increasingly require multiple embodied agents to collaborate in dynamic environments under embodied constraints, as many tasks exceed the capabilities of any single agent. Recent advances in large language models (LLMs) enable high-level cognitive coordination through reasoning, planning, and natural language communication. However, fine-grained analyses of how such collaboration emerges, unfolds, and contributes to task success in embodied multi-agent systems are difficult to conduct with existing benchmarks. In this paper, we introduce EmCoop, a benchmark framework for studying cooperation in LLM-based embodied multi-agent systems. Our framework separates a high-level cognitive layer from a low-level embodied interaction layer, allowing us to characterize agent cooperation through their interleaved dynamics over time. Given a cooperation-constrained embodied task, we propose generalizable, process-level metrics that diagnose collaboration quality and failure modes, beyond final task success. We instantiate our framework in two embodied environments that scale to arbitrary numbers of agents and support diverse communication topologies, and use these instantiations to demonstrate how EmCoop enables systematic analysis of cooperation dynamics across team sizes and task settings. The project web page can be found at: https://happyeureka.github.io/emcoop.
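To make the idea of "diverse communication topologies" concrete, here is a minimal sketch of one way a topology could restrict who may message whom, independent of team size. It is not EmCoop's API: the function names `make_topology` and `deliver`, the topology labels, and the message tuple format are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def make_topology(kind, agents):
    """Return a dict mapping each sender to the set of agents it may message.

    Hypothetical helper: the topology names below are illustrative assumptions,
    not options defined by the paper.
    """
    if kind == "fully_connected":
        return {a: set(agents) - {a} for a in agents}
    if kind == "ring":
        n = len(agents)
        return {
            agents[i]: {agents[(i - 1) % n], agents[(i + 1) % n]} - {agents[i]}
            for i in range(n)
        }
    if kind == "star":
        # The first agent acts as the hub; all others talk only to the hub.
        hub = agents[0]
        topo = {a: {hub} for a in agents[1:]}
        topo[hub] = set(agents[1:])
        return topo
    raise ValueError(f"unknown topology: {kind}")


def deliver(topology, outbox):
    """Route (sender, receiver, text) messages, dropping any the topology forbids."""
    inbox = defaultdict(list)
    for sender, receiver, text in outbox:
        if receiver in topology.get(sender, set()):
            inbox[receiver].append((sender, text))
    return dict(inbox)


if __name__ == "__main__":
    team = [f"agent{i}" for i in range(4)]
    star = make_topology("star", team)
    messages = [
        ("agent1", "agent2", "I will take the left room."),  # forbidden in a star
        ("agent1", "agent0", "Left room is clear."),          # allowed via the hub
    ]
    print(deliver(star, messages))
```

Because the topology is just a mapping over agent names, the same routing step works for any number of agents, which is the property the abstract emphasizes.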

Problem

Research questions and friction points this paper is trying to address.

embodied cooperation

LLM agents

multi-agent systems

benchmark

collaboration dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

embodied cooperation

LLM agents

multi-agent systems