🤖 AI Summary
This work addresses task allocation for heterogeneous human–UAV–UGV teams in emergency response, where communication is limited, agents observe the environment only partially, and tasks must be completed under strict time constraints. We propose the first Dec-POMDP modeling framework specifically designed for emergency crowdsourced sensing. To strengthen coordination, we introduce a "Hard-Cooperative" policy in which UGVs prioritize recharging low-battery UAVs alongside their own sensing tasks. We also develop HECTA4ER, a multi-agent reinforcement learning algorithm that combines hidden-state modeling of the action-observation history, task-specific feature extraction, and a mixing network that fuses global and local information. In simulation, the approach improves the task completion rate by 18.42% on average over baseline algorithms, and a real-world case study indicates that it remains effective and robust in dynamic emergency scenarios.
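For readers unfamiliar with the formalism, the block below writes out the standard textbook definition of a Dec-POMDP and a plausible form of the objective. It is a sketch only: the symbols and the exact definition of the task completion rate (TCR) are assumptions made here for illustration, and the paper's own notation and reward design may differ.

```latex
% Standard Dec-POMDP tuple (textbook form; symbol names are assumptions)
\[
\mathcal{M} = \big\langle \mathcal{I},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{I}},\ P,\ R,\ \{\Omega_i\}_{i \in \mathcal{I}},\ O,\ \gamma \big\rangle
\]
% I: set of agents (here humans, UAVs, UGVs); S: global states;
% A_i: actions of agent i; P(s' | s, a): transition function;
% R(s, a): shared team reward; Omega_i: observations of agent i;
% O(o | s', a): observation function; gamma in [0, 1): discount factor.

% Each agent i acts on its own action-observation history tau_i,
% and the team maximizes the expected discounted return:
\[
\max_{\pi_1, \dots, \pi_{|\mathcal{I}|}} \;
\mathbb{E}\!\left[ \sum_{t=0}^{T} \gamma^{t} R(s_t, \mathbf{a}_t) \right],
\qquad a_{i,t} \sim \pi_i(\cdot \mid \tau_{i,t}).
\]

% Assumed definition of the evaluation metric (not taken from the source):
\[
\mathrm{TCR} = \frac{\text{number of tasks completed before the deadline}}{\text{total number of tasks}}.
\]
```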
📝 Abstract
Mobile crowdsensing is evolving beyond traditional human-centric models by integrating heterogeneous entities such as unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). Optimizing task allocation among these diverse agents is critical, particularly in emergency rescue scenarios characterized by complex environments, limited communication, and partial observability. This paper tackles the Heterogeneous-Entity Collaborative-Sensing Task Allocation (HECTA) problem for emergency rescue, considering humans, UAVs, and UGVs. We introduce a novel "Hard-Cooperative" policy in which UGVs prioritize recharging low-battery UAVs alongside performing their own sensing tasks. The primary objective is to maximize the task completion rate (TCR) under strict time constraints. We formulate this NP-hard problem as a decentralized partially observable Markov decision process (Dec-POMDP) to handle sequential decision-making under uncertainty. To solve it, we propose HECTA4ER, a novel multi-agent reinforcement learning algorithm built on a Centralized Training with Decentralized Execution (CTDE) architecture. To address partial observability, HECTA4ER incorporates specialized modules for complex feature extraction, uses the action-observation history via hidden states, and employs a mixing network that integrates global and local information. Theoretical analysis confirms the algorithm's convergence properties. Extensive simulations show that HECTA4ER significantly outperforms baseline algorithms, achieving an average 18.42% increase in TCR. A real-world case study further validates the algorithm's effectiveness and robustness in dynamic sensing scenarios, highlighting its potential for practical application in emergency response.
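To make the "Hard-Cooperative" idea and the TCR metric concrete, the following is a minimal Python sketch. It is a hand-written priority rule, not HECTA4ER's learned policy: the 0.2 battery threshold, the nearest-target heuristic, and every class and function name are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical threshold below which a UAV counts as "low battery".
LOW_BATTERY = 0.2


@dataclass
class UAV:
    name: str
    x: float
    y: float
    battery: float  # remaining charge as a fraction in [0, 1]


@dataclass
class Task:
    name: str
    x: float
    y: float
    done: bool = False


def ugv_choose_target(ugv_xy, uavs, tasks):
    """Pick a UGV action: recharging a low-battery UAV takes priority
    over sensing (an illustrative stand-in for the Hard-Cooperative policy)."""
    ux, uy = ugv_xy
    low = [u for u in uavs if u.battery < LOW_BATTERY]
    if low:
        # Serve the nearest low-battery UAV first (assumed heuristic).
        nearest = min(low, key=lambda u: hypot(u.x - ux, u.y - uy))
        return ("recharge", nearest.name)
    pending = [t for t in tasks if not t.done]
    if pending:
        # Otherwise fall back to the nearest unfinished sensing task.
        nearest = min(pending, key=lambda t: hypot(t.x - ux, t.y - uy))
        return ("sense", nearest.name)
    return ("idle", None)


def task_completion_rate(tasks):
    """Fraction of tasks marked done; 0.0 when there are no tasks."""
    return sum(t.done for t in tasks) / len(tasks) if tasks else 0.0


if __name__ == "__main__":
    uavs = [UAV("uav-a", 0.0, 0.0, 0.9), UAV("uav-b", 5.0, 5.0, 0.1)]
    tasks = [Task("t1", 1.0, 1.0), Task("t2", 2.0, 2.0, done=True)]
    print(ugv_choose_target((0.0, 0.0), uavs, tasks))
    print(task_completion_rate(tasks))
```

In the paper's setting the agents learn when to recharge and when to sense through reinforcement learning under partial observability; the fixed rule above only shows the priority ordering that the policy's name suggests.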