Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning

šŸ“… 2025-04-24
šŸ“ˆ Citations: 0
✨ Influential: 0
šŸ¤– AI Summary
This paper addresses the inefficiency of large language model (LLM)-based agents in embodied multi-agent collaboration, where performance drops by as much as 15% when agents are required to communicate detailed task-completion plans. To tackle this, we propose MINDcraft, a framework featuring a dynamic, natural-language-based interaction protocol, tight integration with the Minecraft embodied environment, a modular multi-agent architecture, and dedicated collaboration metrics, moving beyond conventional in-context and imitation learning paradigms. Our empirical analysis identifies inefficient inter-agent communication as the primary bottleneck in current LLM-driven embodied collaboration and demonstrates that over-specification in task plans actively impairs coordination. Complementing MINDcraft, we introduce MineCollab, a standardized benchmark for embodied multi-agent collaboration designed to evaluate adaptive, open-world reasoning and cooperative problem-solving. Together, MINDcraft and MineCollab support rigorous, reproducible research on scalable, context-aware collaborative intelligence.

šŸ“ Abstract
Collaboration is ubiquitous and essential in day-to-day life -- from exchanging ideas, to delegating tasks, to generating plans together. This work studies how LLMs can adaptively collaborate to perform complex embodied reasoning tasks. To this end we introduce MINDcraft, an easily extensible platform built to enable LLM agents to control characters in the open-world game of Minecraft; and MineCollab, a benchmark to test the different dimensions of embodied and collaborative reasoning. An experimental study finds that the primary bottleneck in collaborating effectively for current state-of-the-art agents is efficient natural language communication, with agent performance dropping as much as 15% when they are required to communicate detailed task completion plans. We conclude that existing LLM agents are ill-optimized for multi-agent collaboration, especially in embodied scenarios, and highlight the need to employ methods beyond in-context and imitation learning. Our website can be found here: https://mindcraft-minecollab.github.io/
Problem

Research questions and friction points this paper is trying to address.

How LLMs collaborate for embodied reasoning tasks
Evaluating communication bottlenecks in multi-agent systems
Optimizing LLM agents for embodied multi-agent collaboration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LLM framework for embodied reasoning
MINDcraft platform for LLM agents in Minecraft
Benchmark MineCollab for collaborative reasoning
Authors

Isadora White
University of California, San Diego
Kolby Nottingham
Latitude Games Inc
Ayush Maniar
University of California, San Diego
Max Robinson
Emergent Garden
Hansen Lillemark
University of California, San Diego
Mehul Maheshwari
University of California, San Diego
Lianhui Qin
UC San Diego, Computer Science and Engineering
Prithviraj Ammanabrolu
Assistant Professor, University of California, San Diego