The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition

📅 2019-01-23

🏛️ arXiv.org

📈 Citations: 34

✨ Influential: 3

career value

217K/year

🤖 AI Summary

Existing multi-agent reinforcement learning (MARL) methods suffer from poor generalization under diverse reward structures and heterogeneous opponents, alongside limited scalability. This paper introduces the first standardized MARL competition platform built on Minecraft, leveraging Malmo for distributed training and integrating PPO, Q-learning, opponent modeling, and meta-learning to enable unified agent training across games and opponent types. Its key contributions are: (i) establishing the first multi-game, multi-opponent benchmark in a 3D open-world environment, with generalization—rather than single-task overfitting—as the core evaluation metric; and (ii) demonstrating significantly improved win-rate stability across diverse maps, rule sets, and opponent configurations. The platform provides a critical, reproducible benchmark and technical framework for MARL research toward artificial general intelligence (AGI).

📝 Abstract

Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in Malm""O (MARL""O) competition is a new challenge that proposes research in this domain using multiple 3D games. The goal of this contest is to foster research in general agents that can learn across different games and opponent types, proposing a challenge as a milestone in the direction of Artificial General Intelligence.

Problem

Research questions and friction points this paper is trying to address.

Address scalability issues in multi-agent reinforcement learning

Develop general agents for diverse games and opponents

Advance research toward Artificial General Intelligence

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Reinforcement Learning in 3D games

Scalable learning across different opponent types

General reward settings for diverse games

🔎 Similar Papers

No similar papers found.