Fixing Incomplete Value Function Decomposition for Multi-Agent Reinforcement Learning

📅 2025-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multi-agent reinforcement learning, value function decomposition must satisfy the Individual-Global-Max (IGM) property to ensure policy consistency; however, mainstream methods like VDN and QMIX suffer from limited expressivity and cannot represent the complete IGM value class, whereas the expressive QPLEX incurs excessive complexity. This paper introduces the QFIX family of models, which provides, for the first time, a concise, differentiable parameterization that fully captures the IGM value class. QFIX integrates a lightweight, learnable monotonic mixing correction layer into the VDN/QMIX architecture. This design achieves theoretical completeness while substantially reducing parameter count and computational overhead, thereby enhancing training stability and convergence speed. Empirical evaluation on SMACv2 and Overcooked benchmarks demonstrates that QFIX consistently outperforms VDN and QMIX, matches or exceeds QPLEX in performance, and establishes new state-of-the-art results.

📝 Abstract
Value function decomposition methods for cooperative multi-agent reinforcement learning compose joint values from individual per-agent utilities, and train them using a joint objective. To ensure that the action selection process between individual utilities and joint values remains consistent, it is imperative for the composition to satisfy the individual-global max (IGM) property. Although satisfying IGM itself is straightforward, most existing methods (e.g., VDN, QMIX) have limited representation capabilities and are unable to represent the full class of IGM values, and the one exception that has no such limitation (QPLEX) is unnecessarily complex. In this work, we present a simple formulation of the full class of IGM values that naturally leads to the derivation of QFIX, a novel family of value function decomposition models that expand the representation capabilities of prior models by means of a thin "fixing" layer. We derive multiple variants of QFIX, and implement three variants in two well-known multi-agent frameworks. We perform an empirical evaluation on multiple SMACv2 and Overcooked environments, which confirms that QFIX (i) succeeds in enhancing the performance of prior methods, (ii) learns more stably and performs better than its main competitor QPLEX, and (iii) achieves this while employing the simplest and smallest mixing models.
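The IGM property referenced in the abstract can be illustrated with a minimal sketch. This is a hedged example with hypothetical utility values and VDN-style additive mixing, not the paper's QFIX model: under additive mixing, the joint greedy action coincides with the tuple of per-agent greedy actions.

```python
import numpy as np

# Two agents, each with per-agent utilities over 3 actions (hypothetical values).
q1 = np.array([0.2, 1.5, -0.3])
q2 = np.array([0.9, 0.1, 2.0])

# VDN-style additive mixing: Q_joint(a1, a2) = q1[a1] + q2[a2].
q_joint = q1[:, None] + q2[None, :]

# IGM consistency: the joint greedy action equals the tuple of
# individual greedy actions, so decentralized action selection
# agrees with centralized action selection.
joint_greedy = np.unravel_index(q_joint.argmax(), q_joint.shape)
individual_greedy = (q1.argmax(), q2.argmax())
assert joint_greedy == tuple(individual_greedy)
```

The limited expressivity noted in the abstract comes from the converse direction: additive (or monotonic) mixing can represent only a subset of the joint value functions whose greedy actions satisfy IGM.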
Problem

Research questions and friction points this paper is trying to address.

Enhancing representation of IGM values in multi-agent reinforcement learning
Simplifying complex models for value function decomposition
Improving performance and stability in cooperative multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

QFIX introduces a thin fixing layer
QFIX expands representation capabilities
QFIX simplifies IGM value decomposition
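To give intuition for why a thin state-dependent correction layer can add expressivity without breaking IGM, here is a hedged sketch of the general idea: any positive scaling plus a bias (stand-ins for learned state-dependent quantities w(s) > 0 and b(s)) leaves the joint argmax, and hence IGM consistency, intact. This illustrates the principle only; it is not the paper's exact QFIX parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-agent utilities for two agents with 4 actions each.
q1, q2 = rng.normal(size=4), rng.normal(size=4)
q_vdn = q1[:, None] + q2[None, :]  # VDN-style additive joint value

# A "fixing" transform: a positive scale and a bias reshape the joint
# values (adding expressivity) without changing which action is greedy.
w, b = 2.7, -1.3  # stand-ins for learned w(s) > 0 and b(s)
q_fixed = w * q_vdn + b

# The greedy joint action is unchanged, so IGM still holds.
assert (np.unravel_index(q_vdn.argmax(), q_vdn.shape)
        == np.unravel_index(q_fixed.argmax(), q_fixed.shape))
```

The design intuition is that the inner VDN/QMIX network handles action selection while the correction layer absorbs value-scale errors, which is consistent with the page's claim that QFIX uses the simplest and smallest mixing models.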
Andrea Baisero
PhD student, Northeastern University
Reinforcement Learning, Partial Observability, Decentralized Multi-Agent Control
R. Bhati
Khoury College of Computer Sciences, Northeastern University
Shuo Liu
Khoury College of Computer Sciences, Northeastern University
Aathira Pillai
Khoury College of Computer Sciences, Northeastern University
Christopher Amato
Associate Professor at Northeastern University
Artificial Intelligence, Multi-Agent Systems, Multi-Robot Systems, Reinforcement Learning