🤖 AI Summary
Heterogeneous multi-embodied agent systems struggle to efficiently integrate static knowledge, multimodal training data, and high-frequency sensor streams because they lack a unified data management infrastructure. This work proposes the first unified, data-centric architecture tailored for such systems, coordinating static metadata, task-aligned training corpora, and real-time sensor streams. The framework incorporates mechanisms for data fusion, context-aware execution, and closed-loop feedback, enabling task-driven model training and collaborative multi-agent decision-making. A demonstration on complex tasks shows advantages in scalability, maintainability, and continuous evolvability.
📝 Abstract
Heterogeneous Multi-Embodied Agent Systems involve coordinating multiple embodied agents with diverse capabilities to accomplish tasks in dynamic environments. This process requires the collection, generation, and consumption of massive, heterogeneous data, which primarily falls into three categories: static knowledge regarding the agents, tasks, and environments; multimodal training datasets tailored for various AI models; and high-frequency sensor streams. However, existing frameworks lack a unified data management infrastructure to support the real-world deployment of such systems. To address this gap, we present **HeteroHub**, a data-centric framework that integrates static metadata, task-aligned training corpora, and real-time data streams. The framework supports task-aware model training, context-sensitive execution, and closed-loop control driven by real-world feedback. In our demonstration, HeteroHub successfully coordinates multiple embodied AI agents to execute complex tasks, illustrating how a robust data management framework can enable scalable, maintainable, and evolvable embodied AI systems.
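The three data categories the abstract names could be sketched as a minimal unified store. This is purely an illustrative sketch: all class names, fields, and methods here are hypothetical and are not part of HeteroHub's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch of the three data categories behind one interface;
# names and structure are assumptions, not the paper's implementation.

@dataclass
class AgentMetadata:
    """Static knowledge: agent capabilities, task and environment descriptions."""
    agent_id: str
    capabilities: List[str]

@dataclass
class TrainingSample:
    """Entry in a task-aligned multimodal training corpus."""
    task_id: str
    modality: str          # e.g. "image", "lidar", "text"
    payload: bytes

@dataclass
class SensorReading:
    """Record from a high-frequency real-time sensor stream."""
    agent_id: str
    timestamp: float
    values: Dict[str, float]

@dataclass
class DataHub:
    """Unifies static metadata, training corpora, and sensor streams."""
    metadata: Dict[str, AgentMetadata] = field(default_factory=dict)
    corpus: List[TrainingSample] = field(default_factory=list)
    streams: List[SensorReading] = field(default_factory=list)

    def register(self, meta: AgentMetadata) -> None:
        self.metadata[meta.agent_id] = meta

    def ingest(self, reading: SensorReading) -> None:
        # Closed-loop feedback: stream records are logged and could later
        # be promoted into the training corpus for task-driven retraining.
        self.streams.append(reading)

hub = DataHub()
hub.register(AgentMetadata("arm-01", ["grasp", "navigate"]))
hub.ingest(SensorReading("arm-01", 0.0, {"joint_0": 1.57}))
print(len(hub.metadata), len(hub.streams))  # → 1 1
```

The single `DataHub` interface stands in for the paper's claim that one coordinated store, rather than three disconnected systems, is what enables task-aware training and closed-loop control.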