🤖 AI Summary
This work addresses multi-robot collaborative pickup-and-delivery tasks driven by natural-language instructions. It introduces a language-driven hierarchical architecture: (i) a lightweight LLaMA3 model parses spoken or written instructions to extract pickup and delivery locations; (ii) a Voronoi tessellation partitions the workspace into robot-specific regions, and relay points on the shared boundaries let neighboring robots hand the item over; and (iii) a finite-state machine, integrated with ROS2, governs each robot's behavior during execution. Evaluated in Gazebo simulation and on physical TurtleBot3 robots, the system keeps mission cost consistent across varying team sizes, reduces per-robot workload by up to 55% compared with a single-robot baseline, and keeps the number of active relay agents low as the team grows. The core contribution is a tightly coupled language–spatial–control design that applies Voronoi-based relaying to natural-language-guided multi-robot cooperative delivery.
📝 Abstract
We present DELIVER (Directed Execution of Language-instructed Item Via Engineered Relay), a fully integrated framework for cooperative multi-robot pickup and delivery driven by natural language commands. DELIVER unifies natural language understanding, spatial decomposition, relay planning, and motion execution to enable scalable, collision-free coordination in real-world settings. Given a spoken or written instruction, a lightweight instance of LLaMA3 interprets the command to extract pickup and delivery locations. The environment is partitioned using a Voronoi tessellation to define robot-specific operating regions. Robots then compute optimal relay points along shared boundaries and coordinate handoffs. A finite-state machine governs each robot's behavior, enabling robust execution. We implement DELIVER on the MultiTRAIL simulation platform and validate it in both ROS2-based Gazebo simulations and real-world hardware using TurtleBot3 robots. Empirical results show that DELIVER maintains consistent mission cost across varying team sizes while reducing per-agent workload by up to 55% compared to a single-agent system. Moreover, the number of active relay agents remains low even as team size increases, demonstrating the system's scalability and efficient agent utilization. These findings underscore DELIVER's modular and extensible architecture for language-guided multi-robot coordination, advancing the frontiers of cyber-physical system integration.
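To make the partition-and-relay idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each robot's position serves as a Voronoi site, that the item travels along the straight segment from pickup to delivery, and that a relay point is simply where that segment crosses the boundary between two neighboring cells. The abstract speaks of "optimal" relay points without giving the criterion, so the straight-line crossing here is a stand-in. The names `nearest_site` and `relay_plan` are hypothetical.

```python
from math import dist

Point = tuple[float, float]


def nearest_site(p: Point, sites: list[Point]) -> int:
    """Index of the Voronoi cell (robot) that contains point p."""
    return min(range(len(sites)), key=lambda i: dist(p, sites[i]))


def relay_plan(pickup: Point, delivery: Point, sites: list[Point],
               eps: float = 1e-9) -> tuple[list[int], list[Point]]:
    """Walk the pickup->delivery segment across the Voronoi cells.

    Returns the ordered robot indices that carry the item and the handoff
    points on the shared cell boundaries (assumed straight-line crossings).
    """
    ax, ay = pickup
    dx, dy = delivery[0] - ax, delivery[1] - ay
    owner = nearest_site(pickup, sites)
    t_cur = 0.0                      # progress along the segment, in [0, 1]
    chain, relays = [owner], []
    while True:
        six, siy = sites[owner]
        best_t, best_j = None, None
        for j, (sjx, sjy) in enumerate(sites):
            if j == owner:
                continue
            nx, ny = sjx - six, sjy - siy
            # The segment a + t*d meets the bisector of sites i and j where
            # 2*(a + t*d).(sj - si) = |sj|^2 - |si|^2.
            den = 2.0 * (dx * nx + dy * ny)
            if den <= eps:           # not moving toward robot j's cell
                continue
            num = (sjx**2 + sjy**2) - (six**2 + siy**2) - 2.0 * (ax * nx + ay * ny)
            t = num / den
            if t_cur + eps < t <= 1.0 and (best_t is None or t < best_t):
                best_t, best_j = t, j
        if best_t is None:           # delivery point lies in the current cell
            return chain, relays
        relays.append((ax + best_t * dx, ay + best_t * dy))
        owner, t_cur = best_j, best_t
        chain.append(owner)


if __name__ == "__main__":
    robots = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
    chain, relays = relay_plan((1.0, 1.0), (19.0, 1.0), robots)
    print("carrier order:", chain)
    print("relay points:", relays)
```

With three robots spaced along a line, the item changes hands twice, once at each cell boundary, so each robot covers only part of the route. That is the intuition behind the reported per-agent workload reduction. A task whose pickup and delivery points fall in the same cell yields no relays and a single carrier, consistent with the observation that few relay agents need to be active.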