AI Summary
To address coordination challenges in heterogeneous multi-robot systems arising from disparities in physical capabilities, this paper proposes an embodied-intelligence-oriented operating system framework. Methodologically, it integrates large language models, forward/inverse kinematics solvers, hierarchical task planning, and high-fidelity simulation. Key contributions include: (1) the Robot Resume mechanism, the first of its kind, which automatically parses URDF models and invokes kinematic tools to generate standardized, declarative descriptions of robots' physical capabilities; (2) an embodied-perception-enabled hierarchical multi-agent architecture that explicitly decouples high-level task planning from low-level motion execution; and (3) the Habitat-MAS simulation benchmark for evaluating multi-robot embodied AI. The framework is validated across diverse embodied tasks, including manipulation, navigation, and cross-floor object rearrangement, demonstrating that Robot Resume and the hierarchical design significantly improve both collaborative efficiency and generalization across heterogeneous robot teams.
Abstract
Heterogeneous multi-robot systems (HMRS) have emerged as a powerful approach for tackling complex tasks that single robots cannot manage alone. Current large-language-model-based multi-agent systems (LLM-based MAS) have shown success in areas like software development and operating systems, but applying these systems to robot control presents unique challenges. In particular, the capabilities of each agent in a multi-robot system are inherently tied to the physical composition of the robots, rather than to predefined roles. To address this issue, we introduce a novel multi-agent framework designed to enable effective collaboration among heterogeneous robots with varying embodiments and capabilities, along with a new benchmark named Habitat-MAS. One of our key designs is $\textit{Robot Resume}$: instead of adopting human-designed role play, we propose a self-prompted approach, where agents comprehend robot URDF files and call robot kinematics tools to generate descriptions of their physical capabilities, which then guide their behavior in task planning and action execution. The Habitat-MAS benchmark is designed to assess how a multi-agent framework handles tasks that require embodiment-aware reasoning, covering 1) manipulation, 2) perception, 3) navigation, and 4) comprehensive multi-floor object rearrangement. The experimental results indicate that Robot Resume and the hierarchical design of our multi-agent system are essential for the effective operation of the heterogeneous multi-robot system within this intricate problem context.
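To make the Robot Resume idea more concrete, the following is a minimal sketch of how a capability description might be derived from a URDF file. It is not the authors' implementation: the toy URDF, the function names `build_resume` and `render_resume`, and the reach estimate (a crude upper bound obtained by summing joint-origin offsets, standing in for a real kinematics tool) are all assumptions made for illustration.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical toy URDF, invented for this example only.
TOY_URDF = """
<robot name="toy_arm">
  <link name="base"/>
  <link name="upper_arm"/>
  <link name="forearm"/>
  <joint name="shoulder" type="revolute">
    <parent link="base"/>
    <child link="upper_arm"/>
    <origin xyz="0 0 0.5" rpy="0 0 0"/>
    <limit lower="-1.57" upper="1.57" effort="10" velocity="1"/>
  </joint>
  <joint name="elbow" type="revolute">
    <parent link="upper_arm"/>
    <child link="forearm"/>
    <origin xyz="0.3 0 0" rpy="0 0 0"/>
    <limit lower="0" upper="2.0" effort="10" velocity="1"/>
  </joint>
</robot>
"""


def build_resume(urdf_text):
    """Parse a URDF string into a small dictionary of capability facts."""
    root = ET.fromstring(urdf_text.strip())
    joints = []
    reach_bound = 0.0
    for joint in root.findall("joint"):
        # Joint origin offset relative to the parent link (defaults to zero).
        origin = joint.find("origin")
        xyz_text = origin.get("xyz", "0 0 0") if origin is not None else "0 0 0"
        offset = [float(v) for v in xyz_text.split()]
        # Assumption: summing offset lengths gives a loose upper bound on reach.
        reach_bound += math.sqrt(sum(v * v for v in offset))

        limit = joint.find("limit")
        joints.append({
            "name": joint.get("name"),
            "type": joint.get("type"),
            "lower": float(limit.get("lower", "0")) if limit is not None else None,
            "upper": float(limit.get("upper", "0")) if limit is not None else None,
        })
    return {
        "name": root.get("name"),
        "joints": joints,
        "reach_upper_bound_m": reach_bound,
    }


def render_resume(resume):
    """Turn the capability dictionary into text that could be placed in a prompt."""
    lines = [
        f"Robot: {resume['name']}",
        f"Approximate reach (upper bound): {resume['reach_upper_bound_m']:.2f} m",
    ]
    for j in resume["joints"]:
        lines.append(f"- joint {j['name']} ({j['type']}), limits [{j['lower']}, {j['upper']}] rad")
    return "\n".join(lines)


resume = build_resume(TOY_URDF)
print(render_resume(resume))
```

In a fuller system, the text produced by `render_resume` would be handed to the language-model agent as context, and the reach estimate would come from a proper forward-kinematics or workspace analysis rather than this rough bound.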