🤖 AI Summary
In dynamic multi-agent tasks—such as disaster response—environmental disturbances (e.g., weather, obstacles) and agent capability heterogeneity severely undermine the robustness of Courses of Action (COAs).
Method: We propose a joint optimization framework: (1) modeling the task space as an abstract graph; (2) designing a COA pool diversity quantification mechanism that maximizes assignment diversity while preserving agent-task compatibility; and (3) integrating genetic algorithms for multi-agent allocation with a graph neural network–enhanced policy gradient method for single-agent sequential planning.
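The allocation step above can be illustrated with a minimal sketch. This is not the paper's actual genetic algorithm: it uses a mutation-only (1+1)-style evolutionary loop as a stand-in, a toy Hamming-distance diversity measure over assignment vectors, and randomly generated compatibility scores (`compat`, `NUM_AGENTS`, `POOL_SIZE`, and the weight `w` are all hypothetical). It only shows the shape of the joint objective: pool diversity plus agent-task compatibility.

```python
import random

NUM_AGENTS, NUM_TASKS, POOL_SIZE = 3, 10, 4
random.seed(0)
# Hypothetical soft agent-task compatibility scores in [0, 1].
compat = [[random.random() for _ in range(NUM_TASKS)] for _ in range(NUM_AGENTS)]

def random_coa():
    # A COA here assigns each task to one agent (order-ignoring allocation).
    return [random.randrange(NUM_AGENTS) for _ in range(NUM_TASKS)]

def diversity(pool):
    # Mean pairwise Hamming distance between COA assignment vectors.
    pairs = [(a, b) for i, a in enumerate(pool) for b in pool[i + 1:]]
    return sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

def compatibility(pool):
    # Total compatibility of every agent-task mapping in the pool.
    return sum(compat[agent][t] for coa in pool for t, agent in enumerate(coa))

def fitness(pool, w=1.0):
    # Jointly reward pool diversity and agent-task compatibility.
    return diversity(pool) + w * compatibility(pool)

def mutate(pool):
    # Reassign one random task in one random COA to a random agent.
    child = [coa[:] for coa in pool]
    coa = random.choice(child)
    coa[random.randrange(NUM_TASKS)] = random.randrange(NUM_AGENTS)
    return child

# Mutation-only evolutionary loop over whole pools of COAs.
pool = [random_coa() for _ in range(POOL_SIZE)]
for _ in range(500):
    cand = mutate(pool)
    if fitness(cand) >= fitness(pool):
        pool = cand
```

A full GA would additionally maintain a population of pools with crossover and selection; the single accept/reject loop here keeps the sketch short while preserving the objective being optimized.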
Results: In simulation, the framework plans up to 20 high-quality COAs in about 50 minutes for 5-agent/100-task operations; task sequencing shows a small optimality gap, and execution performance significantly outperforms a random-walk baseline. Overall, the approach substantially improves scheduling adaptability and robustness in complex, dynamic environments.
📝 Abstract
Operations in disaster response, search and rescue, and military missions that involve multiple agents demand automated processes to support the planning of courses of action (COAs). Moreover, traverse-affecting changes in the environment (rain, snow, blockades, etc.) may impact the expected performance of a COA, making it desirable to have a pool of COAs that are diverse in their task distributions across agents. Further, variations in agent capabilities, where agents may be human crews and/or autonomous systems, present practical opportunities and computational challenges for the planning process. This paper presents a new theoretical formulation and computational framework to generate such diverse pools of COAs for operations with soft variations in agent-task compatibility. Key to the problem formulation is a graph abstraction of the task space, along with a graph abstraction of the COA pool itself that is used to quantify its diversity. Formulating COA generation as a centralized multi-robot task allocation problem, a genetic algorithm computes (order-ignoring) allocations of tasks to each agent that jointly maximize diversity within the COA pool and the overall compatibility of the agent-task mappings. A graph neural network trained with a policy gradient approach then performs single-agent task sequencing within each COA, maximizing completion rates adaptively to task features. Tests of the COA generation process in a simulated environment demonstrate a significant performance gain over a random-walk baseline, a small optimality gap in task sequencing, and an execution time of about 50 minutes to plan up to 20 COAs for 5-agent/100-task operations.
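The "optimality gap in task sequencing" reported above can be made concrete with a small sketch. This does not reproduce the paper's GNN policy-gradient sequencer; it uses a plain nearest-task-next heuristic as a hypothetical stand-in, on made-up 2-D task locations with Euclidean travel costs, and measures its gap against the brute-force optimal visiting order (tractable only at this toy scale of 7 tasks).

```python
import itertools
import math
import random

random.seed(1)
# Hypothetical task locations on a unit map; travel cost = Euclidean distance.
tasks = [(random.random(), random.random()) for _ in range(7)]
START = (0.0, 0.0)

def tour_cost(order):
    # Total travel cost of visiting tasks in the given order from START.
    cost, pos = 0.0, START
    for i in order:
        cost += math.dist(pos, tasks[i])
        pos = tasks[i]
    return cost

def greedy_sequence():
    # Nearest-task-next heuristic standing in for the learned sequencer.
    remaining, order, pos = set(range(len(tasks))), [], START
    while remaining:
        nxt = min(remaining, key=lambda i: math.dist(pos, tasks[i]))
        order.append(nxt)
        remaining.remove(nxt)
        pos = tasks[nxt]
    return order

# Brute-force optimum over all 7! = 5040 visiting orders.
best = min(itertools.permutations(range(len(tasks))), key=tour_cost)
gap = tour_cost(greedy_sequence()) / tour_cost(best) - 1.0
```

The same gap metric applies regardless of how the sequence is produced, which is what allows a learned sequencer to be benchmarked against exact solutions on small instances.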