🤖 AI Summary
This work addresses the real-time multi-robot task allocation (MRTA) problem in dynamic warehouse environments, aiming to jointly minimize total travel distance and task completion latency under practical constraints including battery endurance and collision-free navigation. We propose the first MRTA framework integrating a two-agent reinforcement learning self-play paradigm with continuous-motion modeling. Specifically, we design MRTAgent—a collaborative decision-making architecture that unifies task assignment and heterogeneous robot coordination via an improved Linear Quadratic Regulator (LQR)-based continuous control policy. Evaluated in a high-fidelity warehouse simulator, our approach reduces average task latency by 37% and total travel distance by 29%, while guaranteeing 100% collision-free operation and sustainable battery usage—thereby comprehensively addressing key practical dimensions of MRTA.
📝 Abstract
Efficient task allocation among multiple robots is crucial for optimizing productivity in modern warehouses, particularly in response to the increasing demands of online order fulfillment. This paper addresses the real-time multi-robot task allocation (MRTA) problem in dynamic warehouse environments, where tasks emerge with specified start and end locations. The objective is to minimize both the total travel distance of robots and delays in task completion, while also considering practical constraints such as battery management and collision avoidance. We introduce MRTAgent, a dual-agent Reinforcement Learning (RL) framework inspired by self-play, designed to optimize task assignments and robot selection to ensure timely task execution. For safe navigation, a modified linear quadratic controller (LQR) approach is employed. To the best of our knowledge, MRTAgent is the first framework to address all critical aspects of practical MRTA problems while supporting continuous robot movements.