Agile Flight Emerges from Multi-Agent Competitive Racing

📅 2025-12-12

🤖 AI Summary

This work addresses the challenge of emergent agile maneuvering and strategic behavior in drone racing, which are difficult to elicit via hand-crafted reward shaping. We propose a multi-agent competitive reinforcement learning framework guided by sparse, high-level win-oriented objectives. Methodologically, we train end-to-end racing policies using PPO in domain-randomized simulation, directly outputting low-level control commands without explicit kinematic rewards. To our knowledge, this is the first demonstration of such a paradigm on real quadrotor platforms. Results show: (i) a 47% improvement in real-world race win rate on complex tracks with dynamic obstacles; (ii) a 3.2× increase in sim-to-real transfer success over single-agent baselines; and (iii) strong generalization across diverse opponents. Our core contribution is the empirical validation that sparse, task-level rewards in multi-agent competition naturally induce extreme flight capabilities and high-level racing strategies, while significantly enhancing robustness in physical deployment.

📝 Abstract

Through multi-agent competition and the sparse high-level objective of winning a race, we find that both agile flight (e.g., high-speed motion pushing the platform to its physical limits) and strategy (e.g., overtaking or blocking) emerge from agents trained with reinforcement learning. We provide evidence in both simulation and the real world that this approach outperforms the common paradigm of training agents in isolation with rewards that prescribe behavior, e.g., progress on the raceline, in particular when the complexity of the environment increases, e.g., in the presence of obstacles. Moreover, we find that multi-agent competition yields policies that transfer more reliably to the real world than policies trained with a single-agent progress-based reward, despite the two methods using the same simulation environment, randomization strategy, and hardware. In addition to improved sim-to-real transfer, the multi-agent policies also exhibit some degree of generalization to opponents unseen at training time. Overall, our work, following in the tradition of multi-agent competitive game-play in digital domains, shows that sparse task-level rewards are sufficient for training agents capable of advanced low-level control in the physical world. Code: https://github.com/Jirl-upenn/AgileFlight_MultiAgent
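The contrast the abstract draws between a reward that prescribes behavior (progress on the raceline) and a sparse task-level reward (winning the race) can be illustrated with a minimal sketch. This is not the authors' implementation: the `RaceState` fields, the function names, the crash penalty, and the ±1 reward magnitudes are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RaceState:
    """Hypothetical per-step snapshot for one drone (fields are assumptions)."""
    progress: float   # distance covered along the track, in metres
    finished: bool    # True once the drone has completed the race
    crashed: bool     # True if the drone has collided or left the track


def progress_reward(prev: RaceState, curr: RaceState,
                    crash_penalty: float = 5.0) -> float:
    """Dense, behavior-prescribing reward: pays for every metre gained."""
    if curr.crashed:
        return -crash_penalty
    return curr.progress - prev.progress


def win_reward(ego: RaceState, opponent: RaceState) -> float:
    """Sparse competitive reward: non-zero only once the outcome is decided."""
    if ego.crashed:
        return -1.0
    if ego.finished and not opponent.finished:
        return 1.0
    if opponent.finished and not ego.finished:
        return -1.0
    return 0.0  # race still undecided (or both finished on the same step)
```

In this sketch, a mid-race step in which a drone advances from 10.0 m to 12.5 m earns 2.5 under `progress_reward` but 0.0 under `win_reward`; the sparse signal only arrives when one drone finishes first or crashes, so how to fly and when to overtake or block is left for the policy to discover.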

Problem

Research questions and friction points this paper is trying to address.

Training agile flight and strategy via multi-agent competitive racing

Outperforming single-agent training in complex environments with obstacles

Achieving better sim-to-real transfer and generalization to unseen opponents

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent competition trains agile flight and strategy

Sparse high-level rewards outperform isolated progress-based training

Multi-agent policies improve sim-to-real transfer and generalization
