MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning

📅 2026-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes MARL-GPT, the first unified Transformer-based foundation model for multi-agent reinforcement learning (MARL), addressing the limited generalization of conventional task-specific approaches. The model is trained via offline reinforcement learning on over 1.5 billion steps of expert trajectories from diverse environments, including the StarCraft Multi-Agent Challenge (SMACv2), Google Research Football, and POGEMA, and achieves performance comparable to specialized methods without task-specific fine-tuning. By demonstrating strong zero-shot transfer across significantly different multi-agent tasks, MARL-GPT provides the first empirical validation that general-purpose foundation models are feasible in MARL, advancing the field toward a foundation-model paradigm.
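The summary does not spell out how a single encoder can consume observations from environments with very different observation sizes. The sketch below shows one generic way this could be done in PyTorch: zero-pad per-agent observations to a common width and apply a single shared projection. The class name, dimensions, and padding scheme are illustrative assumptions, not MARL-GPT's actual interface.

```python
# Minimal sketch (assumptions only): a task-agnostic observation encoder that
# maps per-agent observations of different sizes into one shared embedding
# space by zero-padding to a fixed width and applying a single projection.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedObservationEncoder(nn.Module):
    def __init__(self, max_obs_dim: int = 256, embed_dim: int = 256):
        super().__init__()
        self.max_obs_dim = max_obs_dim
        self.proj = nn.Linear(max_obs_dim, embed_dim)  # shared across all tasks
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, num_agents, obs_dim) -> (batch, num_agents, embed_dim)
        pad = self.max_obs_dim - obs.shape[-1]
        obs = F.pad(obs, (0, pad))          # zero-pad the feature dimension
        return self.norm(self.proj(obs))

encoder = SharedObservationEncoder()
smac_obs = torch.randn(4, 5, 96)    # e.g. 5 SMACv2 agents, 96-dim observations
grf_obs = torch.randn(4, 11, 115)   # e.g. 11 GRF players, 115-dim observations
print(encoder(smac_obs).shape, encoder(grf_obs).shape)
# torch.Size([4, 5, 256]) torch.Size([4, 11, 256])
```

The same encoder weights process every environment, which is one plausible reading of "requires no task-specific tuning".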
📝 Abstract
Recent advances in multi-agent reinforcement learning (MARL) have demonstrated success in numerous challenging domains and environments, but typically require a specialized model for each task. In this work, we propose a coherent methodology that makes it possible for a single GPT-based model to learn and perform well across diverse MARL environments and tasks, including the StarCraft Multi-Agent Challenge, Google Research Football, and POGEMA. Our method, MARL-GPT, applies offline reinforcement learning to train at scale on expert trajectories (400M steps for SMACv2, 100M for GRF, and 1B for POGEMA), combined with a single transformer-based observation encoder that requires no task-specific tuning. Experiments show that MARL-GPT achieves competitive performance compared to specialized baselines in all tested environments. Our findings thus suggest that it is indeed possible to build a multi-task transformer-based model for a wide variety of significantly different multi-agent problems, paving the way toward a foundational MARL model (akin to ChatGPT, Llama, and Mistral in natural language modeling).
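As a rough illustration of the training signal the abstract describes (offline learning from expert trajectories with a GPT-style model), the sketch below runs one behavior-cloning step over a dummy batch of expert observation/action sequences with a causal transformer in PyTorch. It is a generic formulation under stated assumptions; the paper's actual architecture, action spaces, and offline-RL objective may differ.

```python
# Hedged sketch: a causal (GPT-style) transformer over embedded observation
# sequences, trained to imitate expert discrete actions. All hyperparameters
# and shapes are placeholders, not MARL-GPT's reported configuration.
import torch
import torch.nn as nn

class TrajectoryGPT(nn.Module):
    def __init__(self, embed_dim=256, n_actions=20, n_heads=8, n_layers=4, max_len=128):
        super().__init__()
        self.pos_emb = nn.Embedding(max_len, embed_dim)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.action_head = nn.Linear(embed_dim, n_actions)

    def forward(self, obs_tokens: torch.Tensor) -> torch.Tensor:
        # obs_tokens: (batch, seq_len, embed_dim) -- already-embedded observations.
        seq_len = obs_tokens.shape[1]
        positions = torch.arange(seq_len, device=obs_tokens.device)
        x = obs_tokens + self.pos_emb(positions)
        # Causal mask: each timestep may only attend to itself and the past.
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=x.device), diagonal=1
        )
        h = self.backbone(x, mask=causal_mask)
        return self.action_head(h)  # per-timestep action logits

model = TrajectoryGPT()
optim = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# One behavior-cloning step on a dummy batch of expert trajectories.
obs_tokens = torch.randn(16, 32, 256)            # (batch, timesteps, embed_dim)
expert_actions = torch.randint(0, 20, (16, 32))  # expert's discrete actions
optim.zero_grad()
logits = model(obs_tokens)
loss = loss_fn(logits.reshape(-1, 20), expert_actions.reshape(-1))
loss.backward()
optim.step()
```

In practice the observation embeddings would come from a shared encoder such as the one sketched above, and the trajectories from the SMACv2, GRF, and POGEMA expert datasets mentioned in the abstract.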
Problem

Research questions and friction points this paper is trying to address.

multi-agent reinforcement learning
foundation model
multi-task learning
transformer-based model
generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

foundation model
multi-agent reinforcement learning
offline reinforcement learning
transformer-based architecture
multi-task learning