Results of the NeurIPS 2023 Neural MMO Competition on Multi-task Reinforcement Learning

📅 2025-08-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of cross-task and cross-environment generalization in multi-task reinforcement learning. It proposes a unified training framework based on goal-conditioned policies and instantiates it in Neural MMO, a high-complexity, open-world multi-agent environment. Methodologically, goal representation is decoupled from the policy network, and generalization is jointly optimized across three previously unseen dimensions: tasks, maps, and opponent policies. Experiments demonstrate that the best configuration achieves four times the baseline score within eight hours of training on a single 4090 GPU, while significantly improving zero-shot transfer performance. To foster reproducibility and community advancement, all code and model weights are open-sourced; the competition attracted over 200 participants. This work establishes a rigorous, reproducible benchmark and a principled technical paradigm for generalization in open-world multi-agent reinforcement learning.


📝 Abstract
We present the results of the NeurIPS 2023 Neural MMO Competition, which attracted over 200 participants and submissions. Participants trained goal-conditional policies that generalize to tasks, maps, and opponents never seen during training. The top solution achieved a score 4x higher than our baseline within 8 hours of training on a single 4090 GPU. We open-source everything relating to Neural MMO and the competition under the MIT license, including the policy weights and training code for our baseline and for the top submissions.
Problem

Research questions and friction points this paper is trying to address.

Generalizing goal-conditional policies to tasks unseen during training
Training agents that adapt to new maps and new opponent policies
Establishing a reproducible multi-task reinforcement learning benchmark through competition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Goal-conditional policies for zero-shot generalization
Efficient training: top score reached in 8 hours on a single GPU
Fully open-source competition framework, baselines, and winning solutions (MIT license)
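The goal-conditional idea above can be sketched in a few lines. This is an illustrative toy in plain Python, not the competition architecture or the Neural MMO API: all names (`GoalConditionedPolicy`, `act`, etc.) are hypothetical, and the "encoders" are fixed random linear maps, chosen only to show the structural point from the summary, that the goal representation is kept separate from the observation pathway, so an unseen goal changes the policy's input embedding rather than requiring a new policy.

```python
import random

random.seed(0)

def linear(in_dim, out_dim):
    """A fixed, randomly initialized linear map (stand-in for a trained layer)."""
    return [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

def apply(weights, x):
    """Matrix-vector product over plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

class GoalConditionedPolicy:
    """Toy goal-conditioned policy: the goal encoder is decoupled from the
    observation encoder, and their embeddings are concatenated before the
    action head."""

    def __init__(self, obs_dim, goal_dim, hidden, n_actions):
        self.obs_enc = linear(obs_dim, hidden)
        self.goal_enc = linear(goal_dim, hidden)  # separate goal pathway
        self.head = linear(2 * hidden, n_actions)

    def act(self, obs, goal):
        # Encode observation and goal independently, then combine.
        z = apply(self.obs_enc, obs) + apply(self.goal_enc, goal)
        logits = apply(self.head, z)
        return max(range(len(logits)), key=lambda a: logits[a])  # greedy action

policy = GoalConditionedPolicy(obs_dim=4, goal_dim=3, hidden=8, n_actions=5)
action = policy.act(obs=[0.1, 0.2, 0.3, 0.4], goal=[1.0, 0.0, 0.0])
print(action)  # an integer action index in [0, 5)
```

In a real training setup, the two encoders and the head would be learned networks optimized with reinforcement learning over many sampled goals; the decoupling is what lets the same trained policy be conditioned on goals it never saw during training.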
👥 Authors
Joseph Suárez (Massachusetts Institute of Technology)
Kyoung Whan Choe (CarperAI)
David Bloomin (Plurality Institute)
Jianming Gao (Competition Winners)
Yunkun Li (Competition Winners)
Yao Feng (Competition Winners)
Saidinesh Pola (Competition Winners)
Kun Zhang (Competition Winners)
Yonghui Zhu (Competition Winners)
Nikhil Pinnaparaju (CarperAI)
Hao Xiang Li (CarperAI)
Nishaanth Kanna (CarperAI)
Daniel Scott (CarperAI)
Ryan Sullivan (University of Maryland, College Park)
Rose S. Shuman (CarperAI)
Lucas de Alcântara (CarperAI)
Herbie Bradley (CarperAI)
Kirsty You (Parametrix.AI)
Bo Wu (Parametrix.AI)
Yuhao Jiang (EPFL)
Qimai Li (Parametrix.AI)
Jiaxin Chen (Parametrix.AI)
Louis Castricato (Wayfarer Labs)
Xiaolong Zhu (Parametrix.AI)
Phillip Isola (MIT)