MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing communication protocols in multi-agent reinforcement learning (MARL) suffer from high complexity, non-differentiability, and poor scalability. To address these issues, this paper proposes a lightweight, fully differentiable self-attention communication module. The module generates agent-specific messages in a reward-driven manner, enabling efficient, end-to-end trainable inter-agent information exchange. Its fixed-parameter architecture ensures that computational and communication overhead remain constant regardless of the number of agents, significantly enhancing scalability. Moreover, the module is natively compatible with mainstream value-decomposition methods (e.g., QMIX) and functions as a plug-and-play enhancement. Evaluated on multiple heterogeneous maps in the SMAC benchmark, our approach achieves state-of-the-art performance, demonstrating superior effectiveness, robustness, and generalization capability.

📝 Abstract
Communication is essential for the collective execution of complex tasks by human agents, motivating interest in communication mechanisms for multi-agent reinforcement learning (MARL). However, existing communication protocols in MARL are often complex and non-differentiable. In this work, we introduce a self-attention-based communication module that exchanges information between the agents in MARL. Our proposed approach is fully differentiable, allowing agents to learn to generate messages in a reward-driven manner. The module can be seamlessly integrated with any action-value function decomposition method and can be viewed as an extension of such decompositions. Notably, it includes a fixed number of trainable parameters, independent of the number of agents. Experimental results on the SMAC benchmark demonstrate the effectiveness of our approach, which achieves state-of-the-art performance on several maps.
Problem

Research questions and friction points this paper is trying to address.

Existing MARL communication protocols are complex, non-differentiable, and scale poorly with the number of agents
How to enable end-to-end trainable, differentiable inter-agent information exchange
How to let agents learn to generate messages in a reward-driven manner for collaborative tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-attention-based module for agent communication
Fully differentiable reward-driven message generation
Fixed parameters independent of agent count
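The fixed-parameter property can be illustrated with a minimal NumPy sketch of a shared self-attention message module: the weight matrices depend only on the hidden dimension, so the same module serves any number of agents. This is an illustrative reconstruction under stated assumptions, not the authors' implementation; the class name `AttentionComm` and the weight initialization are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class AttentionComm:
    """Sketch of a self-attention communication module.

    The parameter count depends only on the hidden dimension d,
    not on the number of agents, mirroring the scalability claim.
    """
    def __init__(self, d, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(d)
        # Shared query/key/value projections (assumed Gaussian init).
        self.Wq = rng.normal(0.0, scale, (d, d))
        self.Wk = rng.normal(0.0, scale, (d, d))
        self.Wv = rng.normal(0.0, scale, (d, d))

    def __call__(self, H):
        # H: (n_agents, d) matrix of per-agent hidden states.
        Q, K, V = H @ self.Wq, H @ self.Wk, H @ self.Wv
        # Scaled dot-product attention over all agents.
        A = softmax(Q @ K.T / np.sqrt(H.shape[1]), axis=-1)
        return A @ V  # (n_agents, d): one message per agent
```

Because every operation is differentiable, gradients from a shared (e.g., value-decomposition) loss can flow through the messages, which is how reward-driven message learning becomes possible in such a design.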
Maciej Wojtala
University of Warsaw
Bogusz Stefańczyk
IDEAS Research Institute
Dominik Bogucki
Institute of Fundamental Technological Research, Polish Academy of Sciences
Łukasz Lepak
assistant professor, Warsaw University of Technology
machine learning, neural networks, reinforcement learning, AI applications
Jakub Strykowski
Warsaw University of Technology
Paweł Wawrzyński
IDEAS Research Institute
artificial intelligence, neural networks, reinforcement learning