🤖 AI Summary
Existing MARL benchmark platforms primarily target virtual environments or low-mobility robots and lack standardized cooperative-competitive evaluation for high-dynamics physical systems, especially multi-UAV platforms. This paper introduces VolleyBots, the first real-world physical MARL testbed enabling multi-UAV volleyball competition. Methodologically, it establishes a rule-compliant cooperative-competitive multi-agent testbed grounded in volleyball dynamics; designs a hierarchical decision architecture that unifies low-level flight control with high-level game-theoretic strategy; and achieves zero-shot sim-to-real transfer from a high-fidelity simulation. The platform integrates state-of-the-art MARL algorithms (MAPPO, QMix), Nash equilibrium solvers, ROS/Gazebo simulation, PID/RL hybrid controllers, and UWB-based localization. Experimental results reveal critical performance bottlenecks of mainstream MARL methods in simulation and demonstrate end-to-end autonomous execution (including serve, block, and spike) by a real quadcopter swarm.
📝 Abstract
Multi-agent reinforcement learning (MARL) has made significant progress, largely fueled by the development of specialized testbeds that enable systematic evaluation of algorithms in controlled yet challenging scenarios. However, existing testbeds often focus on purely virtual simulations or limited robot morphologies such as robotic arms, quadrupeds, and humanoids, leaving high-mobility platforms with real-world physical constraints like drones underexplored. To bridge this gap, we present VolleyBots, a new MARL testbed where multiple drones cooperate and compete in the sport of volleyball under physical dynamics. VolleyBots features a turn-based interaction model under volleyball rules, a hierarchical decision-making process that combines motion control and strategic play, and a high-fidelity simulation for seamless sim-to-real transfer. We provide a comprehensive suite of tasks ranging from single-drone drills to multi-drone cooperative and competitive tasks, accompanied by baseline evaluations of representative MARL and game-theoretic algorithms. Results in simulation show that while existing algorithms handle simple tasks effectively, they encounter difficulty in complex tasks that require both low-level control and high-level strategy. We further demonstrate zero-shot deployment of a simulation-learned policy to real-world drones, highlighting VolleyBots' potential to propel MARL research involving agile robotic platforms. The project page is at https://sites.google.com/view/volleybots/home.