Learning to Play Pursuit-Evasion with Dynamic and Sensor Constraints

📅 2024-05-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses a two-agent pursuit-evasion game under car-like dynamics and local sensing constraints, formulated as a partially observable zero-sum stochastic game. The authors propose a multi-stage curriculum reinforcement learning method that jointly accounts for dynamic and perceptual constraints, integrating curriculum learning into the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework to co-optimize pursuit and evasion policies. The approach is implemented and validated in real time on physical robotic platforms, F1TENTH and JetRacer, using ROS. Experiments show up to a 30% improvement in capture rate and up to a 5% increase in escape rate over baselines. Crucially, the learned policies remain stable during high-speed indoor navigation at 2 m/s, indicating strong generalization, robustness, and physical transferability across heterogeneous robot platforms.
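The paper's actual training code is not reproduced here, but the multi-stage curriculum idea can be illustrated with a minimal sketch. All stage names, parameter values, and the `Stage`/`run_curriculum` helpers below are hypothetical; the sketch only shows the general pattern of tightening dynamic and sensing constraints stage by stage before co-training the two policies (e.g. with MADDPG) at each difficulty level.

```python
# Hypothetical curriculum schedule (illustrative values, not the paper's).
# Each stage increases the evader's speed and shrinks the pursuer's
# sensing range, so policies face progressively harder conditions.
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    evader_speed: float  # m/s, evader's top speed at this stage
    sensor_range: float  # m, pursuer's local sensing radius
    episodes: int        # training budget for this stage


CURRICULUM = [
    Stage("warmup", evader_speed=0.5, sensor_range=10.0, episodes=1000),
    Stage("mid",    evader_speed=1.0, sensor_range=5.0,  episodes=2000),
    Stage("full",   evader_speed=2.0, sensor_range=3.0,  episodes=4000),
]


def run_curriculum(train_stage):
    """Run the stages in order; `train_stage` is a caller-supplied trainer
    (here a stub) that co-trains pursuer and evader at one difficulty."""
    results = []
    for stage in CURRICULUM:
        results.append(train_stage(stage))
    return results


if __name__ == "__main__":
    # Stub trainer: just report which stage it would train on.
    log = run_curriculum(lambda s: f"trained {s.name} for {s.episodes} eps")
    print(log)
```

The design point is that later stages inherit policies already competent at easier settings, which is what lets training reach the full 2 m/s, short-sensing-range game.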

📝 Abstract
We present a multi-agent reinforcement learning approach to solve a pursuit-evasion game between two players with car-like dynamics and sensing limitations. We develop a curriculum for an existing multi-agent deterministic policy gradient algorithm to simultaneously obtain strategies for both players, and deploy the learned strategies on real robots moving as fast as 2 m/s in indoor environments. Through experiments we show that the learned strategies improve over existing baselines by up to 30% in terms of capture rate for the pursuer. The learned evader model has up to 5% better escape rate over the baselines even against our competitive pursuer model. We also present experiment results which show how the pursuit-evasion game and its results evolve as the player dynamics and sensor constraints are varied. Finally, we deploy learned policies on physical robots for a game between the F1TENTH and JetRacer platforms and show that the learned strategies can be executed on real robots. Our code and supplementary material including videos from experiments are available at https://gonultasbu.github.io/pursuit-evasion/.
Problem

Research questions and friction points this paper is trying to address.

Develop learning-based method for car-like robots with sensor constraints
Improve capture rate in partially observable pursuit-evasion games
Enable real-robot deployment of learned strategies at high speeds
Innovation

Methods, ideas, or system contributions that make the work stand out.

Encodes history into belief state
Improves capture rate by 16%
Deploys on real robots at 2m/s
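The "encodes history into belief state" point can be sketched in a minimal, assumption-laden way: the paper's actual encoder is not specified on this page, so the `HistoryBelief` class below is purely illustrative. It shows one common approach under partial observability: stacking the last k local observations into a fixed-length vector that stands in for a belief state and is fed to the policy.

```python
# Hypothetical history-based belief encoding (illustrative only).
# Under partial observability, a fixed window of recent observations
# approximates a belief state without an explicit filter.
from collections import deque


class HistoryBelief:
    def __init__(self, obs_dim, k=4):
        self.obs_dim = obs_dim
        # Start with k zero-observations so the vector length is fixed.
        self.buf = deque([[0.0] * obs_dim for _ in range(k)], maxlen=k)

    def update(self, obs):
        """Push the newest local observation; the oldest is evicted."""
        assert len(obs) == self.obs_dim
        self.buf.append(list(obs))

    def vector(self):
        """Flatten the k most recent observations, oldest first."""
        return [x for obs in self.buf for x in obs]
```

A recurrent encoder (e.g. an LSTM over the observation stream) is the other common choice; the fixed window above is just the simplest belief proxy to state concretely.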
Burak M Gonultas
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, 55455, USA
Volkan Isler
Professor, The University of Texas at Austin
Robotics, Agricultural Robotics, Sensor Networks, Computer Vision, Geometric Algorithms