Learning to Play Pursuit-Evasion with Dynamic and Sensor Constraints

📅 2024-05-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses a two-agent pursuit-evasion game under car-like dynamics and local sensing constraints, formulated as a partially observable zero-sum stochastic game. The authors propose a multi-stage curriculum reinforcement learning method that jointly accounts for dynamic and perceptual constraints, integrating curriculum learning into the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework to co-optimize pursuit and evasion policies. The approach is implemented and validated in real time on physical robotic platforms, F1TENTH and JetRacer, using ROS. Experiments show up to a 30% improvement in capture rate and up to a 5% increase in escape rate over baselines. Crucially, the learned policies remain stable during high-speed indoor navigation at 2 m/s, indicating strong generalization, robustness, and physical transferability across heterogeneous robot platforms.
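The paper's actual training code is not reproduced here, but the multi-stage curriculum idea can be illustrated with a minimal sketch. All stage names, parameter values, and the `Stage`/`run_curriculum` helpers below are hypothetical; the sketch only shows the general pattern of tightening dynamic and sensing constraints stage by stage before co-training the two policies (e.g. with MADDPG) at each difficulty level.

```python
# Hypothetical curriculum schedule (illustrative values, not the paper's).
# Each stage increases the evader's speed and shrinks the pursuer's
# sensing range, so policies face progressively harder conditions.
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    evader_speed: float  # m/s, evader's top speed at this stage
    sensor_range: float  # m, pursuer's local sensing radius
    episodes: int        # training budget for this stage


CURRICULUM = [
    Stage("warmup", evader_speed=0.5, sensor_range=10.0, episodes=1000),
    Stage("mid",    evader_speed=1.0, sensor_range=5.0,  episodes=2000),
    Stage("full",   evader_speed=2.0, sensor_range=3.0,  episodes=4000),
]


def run_curriculum(train_stage):
    """Run the stages in order; `train_stage` is a caller-supplied trainer
    (here a stub) that co-trains pursuer and evader at one difficulty."""
    results = []
    for stage in CURRICULUM:
        results.append(train_stage(stage))
    return results


if __name__ == "__main__":
    # Stub trainer: just report which stage it would train on.
    log = run_curriculum(lambda s: f"trained {s.name} for {s.episodes} eps")
    print(log)
```

The design point is that later stages inherit policies already competent at easier settings, which is what lets training reach the full 2 m/s, short-sensing-range game.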

📝 Abstract
We present a multi-agent reinforcement learning approach to solve a pursuit-evasion game between two players with car-like dynamics and sensing limitations. We develop a curriculum for an existing multi-agent deterministic policy gradient algorithm to simultaneously obtain strategies for both players, and deploy the learned strategies on real robots moving as fast as 2 m/s in indoor environments. Through experiments we show that the learned strategies improve over existing baselines by up to 30% in terms of capture rate for the pursuer. The learned evader model has up to 5% better escape rate over the baselines even against our competitive pursuer model. We also present experiment results which show how the pursuit-evasion game and its results evolve as the player dynamics and sensor constraints are varied. Finally, we deploy learned policies on physical robots for a game between the F1TENTH and JetRacer platforms and show that the learned strategies can be executed on real robots. Our code and supplementary material including videos from experiments are available at https://gonultasbu.github.io/pursuit-evasion/.
Problem

Research questions and friction points this paper is trying to address.

Develop learning-based method for car-like robots with sensor constraints
Improve capture rate in partially observable pursuit-evasion games
Enable real-robot deployment of learned strategies at high speeds
Innovation

Methods, ideas, or system contributions that make the work stand out.

Encodes history into belief state
Improves capture rate by 16%
Deploys on real robots at 2m/s
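The "encodes history into belief state" point can be sketched in a minimal, assumption-laden way: the paper's actual encoder is not specified on this page, so the `HistoryBelief` class below is purely illustrative. It shows one common approach under partial observability: stacking the last k local observations into a fixed-length vector that stands in for a belief state and is fed to the policy.

```python
# Hypothetical history-based belief encoding (illustrative only).
# Under partial observability, a fixed window of recent observations
# approximates a belief state without an explicit filter.
from collections import deque


class HistoryBelief:
    def __init__(self, obs_dim, k=4):
        self.obs_dim = obs_dim
        # Start with k zero-observations so the vector length is fixed.
        self.buf = deque([[0.0] * obs_dim for _ in range(k)], maxlen=k)

    def update(self, obs):
        """Push the newest local observation; the oldest is evicted."""
        assert len(obs) == self.obs_dim
        self.buf.append(list(obs))

    def vector(self):
        """Flatten the k most recent observations, oldest first."""
        return [x for obs in self.buf for x in obs]
```

A recurrent encoder (e.g. an LSTM over the observation stream) is the other common choice; the fixed window above is just the simplest belief proxy to state concretely.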
Burak M Gonultas
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, 55455, USA
Volkan Isler
Professor, The University of Texas at Austin
Robotics, Agricultural Robotics, Sensor Networks, Computer Vision, Geometric Algorithms