🤖 AI Summary
LLM-powered embodied multi-agent systems often lack long-horizon strategic planning and suffer from redundant actions and task failures in complex collaborative tasks (e.g., search-and-rescue). To address this, we propose a two-stage cooperative planning framework: (1) multi-agent negotiation to generate structured, long-horizon meta-plans; and (2) dynamic plan adaptation guided by real-time progress feedback. Our key contribution is a progress-adaptive meta-planning mechanism, inspired by human staged negotiation protocols, which jointly ensures strategic coherence and execution flexibility. The method integrates iterative LLM-based negotiation, meta-plan modeling, and embodied interaction in ThreeDWorld. Evaluated on the Transport and Watch-And-Help benchmarks, it achieves substantial improvements in task success rate and execution efficiency, consistently outperforming state-of-the-art approaches.
📝 Abstract
In this work, we address the cooperation problem among large language model (LLM) based embodied agents, where agents must cooperate to achieve a common goal. Previous methods often execute actions extemporaneously and incoherently, without long-term strategic and cooperative planning, leading to redundant steps, failures, and even serious repercussions in complex tasks like search-and-rescue missions, where discussion and a cooperative plan are crucial. To solve this issue, we propose Cooperative Plan Optimization (CaPo) to enhance the cooperation efficiency of LLM-based embodied agents. Inspired by human cooperation schemes, CaPo improves cooperation efficiency with two phases: 1) meta-plan generation, and 2) progress-adaptive meta-plan and execution. In the first phase, all agents analyze the task, discuss, and cooperatively create a meta-plan that decomposes the task into subtasks with detailed steps, ensuring a long-term strategic and coherent plan for efficient coordination. In the second phase, agents execute tasks according to the meta-plan and dynamically adjust it based on their latest progress (e.g., discovering a target object) through multi-turn discussions. This progress-based adaptation eliminates redundant actions, improving the overall cooperation efficiency of agents. Experimental results on the ThreeDWorld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate that CaPo achieves a much higher task completion rate and efficiency compared with state-of-the-art methods. The code is released at https://github.com/jliu4ai/CaPo.
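The two-phase loop described above can be sketched minimally in Python. This is an illustrative stub, not the authors' implementation: `negotiate_meta_plan`, `adapt_meta_plan`, and the round-robin assignment are hypothetical stand-ins for the multi-turn LLM negotiation, and execution is simulated by marking subtasks done. The key idea shown is the progress-adaptive step: after each action, subtasks that the latest progress has made redundant are pruned from the meta-plan.

```python
# Hypothetical sketch of CaPo's two-phase scheme; all names are illustrative.
# The multi-turn LLM negotiation is replaced by simple rules for demonstration.
from dataclasses import dataclass


@dataclass
class MetaPlan:
    # Ordered list of (agent, subtask) assignments agreed on by all agents.
    subtasks: list


def negotiate_meta_plan(agents, task_goal):
    """Phase 1: agents discuss and decompose the task into assigned subtasks.
    A round-robin assignment stands in for the LLM-based negotiation."""
    return MetaPlan(
        subtasks=[(agents[i % len(agents)], sub) for i, sub in enumerate(task_goal)]
    )


def adapt_meta_plan(plan, progress):
    """Phase 2 (adaptation): drop subtasks already satisfied by the latest
    progress, e.g. another agent has found the target object."""
    plan.subtasks = [(a, s) for a, s in plan.subtasks if s not in progress]
    return plan


def run(agents, task_goal):
    """Execute the meta-plan, re-adapting it after every completed action."""
    plan = negotiate_meta_plan(agents, task_goal)
    progress, steps = set(), 0
    while plan.subtasks:
        _agent, subtask = plan.subtasks.pop(0)
        progress.add(subtask)                    # execution: subtask completed
        steps += 1
        plan = adapt_meta_plan(plan, progress)   # stub for multi-turn discussion
    return progress, steps


# Both agents were assigned "find_vase"; once one finds it, the duplicate
# subtask is pruned, so only 2 actions are executed instead of 3.
done, steps = run(["alice", "bob"], ["find_vase", "find_vase", "carry_vase"])
```

Without the adaptation step, the second `find_vase` would be executed redundantly; pruning it is exactly the efficiency gain the abstract attributes to progress-based plan adjustment.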