AI Summary
To balance global optimality against local responsiveness in multi-agent pathfinding under dynamic, partially observable environments, this paper proposes a hybrid framework that integrates D* Lite with multi-agent reinforcement learning (MAPPO). The method introduces three key innovations: (1) a shared exploration map representation for coordinated environmental awareness; (2) an adaptive online policy-switching mechanism that enables context-aware transitions between global re-planning and local decision-making; and (3) a freezing-prevention strategy that keeps agents making progress. It combines D* Lite's incremental graph search with shared attention-based map encoding and dynamic environment modeling. Evaluated on the POGEMA benchmark, the approach achieves a 21.3% improvement in task success rate, a 36.7% reduction in collision rate, and an 18.9% increase in path efficiency. Further tests on the EyeSim platform indicate that it scales to hundred-agent scenarios and highly dynamic environments.
Abstract
Multi-agent pathfinding is used in areas including multi-robot formations, warehouse logistics, and intelligent vehicles. However, many environments are only partially known or change frequently, making it difficult for standard centralized planning or pure reinforcement learning to maintain both global solution quality and local flexibility. This paper introduces a hybrid framework that integrates D* Lite global search with multi-agent reinforcement learning, using a switching mechanism and a freeze-prevention strategy to handle dynamic conditions and crowded settings. We evaluate the framework in the discrete POGEMA environment and compare it with baseline methods. Experimental results indicate that the proposed framework substantially improves success rate, collision rate, and path efficiency. The model is further tested on the EyeSim platform, where it maintains feasible pathfinding under frequent changes and large-scale robot deployments.
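The switching mechanism and freeze-prevention strategy described above can be pictured with a minimal sketch. The abstract does not state the actual switching criterion, so everything here is an assumption made for illustration: the `AgentState` fields, the `choose_mode` and `update_stall` helpers, the mode names, and the threshold values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class AgentState:
    """Hypothetical per-agent summary used only for this sketch."""
    nearby_agents: int   # other agents inside the local observation window
    stalled_steps: int   # consecutive steps without a change of position


def update_stall(stalled_steps, prev_pos, new_pos):
    """Count consecutive steps in which the agent did not move."""
    return stalled_steps + 1 if prev_pos == new_pos else 0


def choose_mode(state, crowd_threshold=2, freeze_limit=3):
    """Pick a control mode for one agent at one time step.

    Assumed rule (not the paper's): a stalled agent is forced out of its
    deadlock first, a crowded agent defers to the learned local policy,
    and otherwise the agent follows the incrementally repaired global path.
    """
    if state.stalled_steps >= freeze_limit:
        return "unfreeze"   # freeze prevention takes priority
    if state.nearby_agents >= crowd_threshold:
        return "local"      # learned policy (e.g. MAPPO) handles congestion
    return "global"         # follow the global planner (e.g. D* Lite)


if __name__ == "__main__":
    print(choose_mode(AgentState(nearby_agents=0, stalled_steps=0)))
    print(choose_mode(AgentState(nearby_agents=3, stalled_steps=0)))
    print(choose_mode(AgentState(nearby_agents=3, stalled_steps=4)))
```

Checking the stall counter before the crowding test is a design choice of this sketch: it ensures that an agent stuck in a congested area is still released, which is the behavior a freeze-prevention strategy is meant to guarantee.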