🤖 AI Summary
Existing multi-agent reinforcement learning (MARL) methods suffer from poor generalization under diverse reward structures and heterogeneous opponents, alongside limited scalability. This paper introduces the first standardized MARL competition platform built on Minecraft, leveraging Malmo for distributed training and integrating PPO, Q-learning, opponent modeling, and meta-learning to enable unified agent training across games and opponent types. Its key contributions are: (i) establishing the first multi-game, multi-opponent benchmark in a 3D open-world environment, with generalization—rather than single-task overfitting—as the core evaluation metric; and (ii) demonstrating significantly improved win-rate stability across diverse maps, rule sets, and opponent configurations. The platform provides a critical, reproducible benchmark and technical framework for MARL research toward artificial general intelligence (AGI).
📝 Abstract
Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence.