🤖 AI Summary
Addressing critical challenges in multi-robot collaboration—namely task hallucination (generating infeasible instructions) and violation of environmental constraints—this paper proposes a hierarchical embodied agent framework. Methodologically, it introduces: (i) a novel hallucination-resilient mechanism integrating next-action prediction with structured memory for long-horizon task decomposition and dynamic feasibility verification; (ii) MultiPlan+, a large-scale benchmark dataset comprising over 18,000 expert-annotated samples focused specifically on unrealistic tasks; and (iii) RPAS, a comprehensive evaluation framework combining automated metrics with LLM-based expert scoring. The approach unifies hierarchical decision-making, multimodal constraint modeling, and LLM-augmented assessment. Experimental results demonstrate that our method achieves an RPAS score of 71.85%, significantly surpassing state-of-the-art baselines. Furthermore, it successfully enables sustained, cross-platform robotic collaboration for two hours in real-world office environments.
📝 Abstract
This paper introduces EmbodiedAgent, a hierarchical framework for heterogeneous multi-robot control. EmbodiedAgent addresses critical limitations of hallucination in impractical tasks. Our approach integrates a next-action prediction paradigm with a structured memory system to decompose tasks into executable robot skills while dynamically validating actions against environmental constraints. We present MultiPlan+, a dataset of more than 18,000 annotated planning instances spanning 100 scenarios, including a subset of impractical cases to mitigate hallucination. To evaluate performance, we propose the Robot Planning Assessment Schema (RPAS), combining automated metrics with LLM-aided expert grading. Experiments demonstrate EmbodiedAgent's superiority over state-of-the-art models, achieving 71.85% RPAS score. Real-world validation in an office service task highlights its ability to coordinate heterogeneous robots for long-horizon objectives.