The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition

📅 2019-01-23
🏛️ arXiv.org
📈 Citations: 34
Influential: 3
🤖 AI Summary
Existing multi-agent reinforcement learning (MARL) methods suffer from poor generalization under diverse reward structures and heterogeneous opponents, alongside limited scalability. This paper introduces the first standardized MARL competition platform built on Minecraft, leveraging Malmo for distributed training and integrating PPO, Q-learning, opponent modeling, and meta-learning to enable unified agent training across games and opponent types. Its key contributions are: (i) establishing the first multi-game, multi-opponent benchmark in a 3D open-world environment, with generalization—rather than single-task overfitting—as the core evaluation metric; and (ii) demonstrating significantly improved win-rate stability across diverse maps, rule sets, and opponent configurations. The platform provides a critical, reproducible benchmark and technical framework for MARL research toward artificial general intelligence (AGI).
📝 Abstract
Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence.
Problem

Research questions and friction points this paper is trying to address.

Address scalability issues in multi-agent reinforcement learning
Develop general agents for diverse games and opponents
Advance research toward Artificial General Intelligence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Reinforcement Learning in 3D games
Scalable learning across different opponent types
General reward settings for diverse games
Authors
Diego Pérez-Liébana
Game AI group, Queen Mary University of London (UK)
Katja Hofmann
Microsoft Research
Machine Learning, Generative Models, Reinforcement Learning, Video Games
S. Mohanty
École Polytechnique Fédérale de Lausanne (Switzerland)
Noburu Kuno
Microsoft Research (UK)
André Kramer
Microsoft Research (UK)
Sam Devlin
Meta
Game AI, Imitation Learning, Multi-Agent Systems, Reinforcement Learning, World Models
Raluca D. Gaina
Game AI group, Queen Mary University of London (UK)
Daniel Ionita
Game AI group, Queen Mary University of London (UK)