🤖 AI Summary
This paper addresses the challenges of Resource Allocation Optimization (RAO) in dynamic, decentralized environments by systematically surveying state-of-the-art applications of Multi-Agent Reinforcement Learning (MARL) to RAO. It proposes the first three-dimensional taxonomy for RAO, spanning collaboration structure, communication mechanism, and learning paradigm. The taxonomy unifies over 120 recent works and maps the technical landscape across key domains, including network slicing, edge computing, and smart grids. By integrating mainstream MARL methodologies (value decomposition, policy gradient methods, communication-aware learning, and opponent modeling), the paper establishes a method-to-use-case mapping framework. It also releases an open, continuously updated MARL-RAO research roadmap, accompanied by a technology selection guide and a practical evaluation framework. Together, these resources aim to make RAO solutions easier to deploy in real-world systems, with attention to scalability, robustness, and operational feasibility.
📝 Abstract
Multi-Agent Reinforcement Learning (MARL) has become a powerful framework for numerous real-world applications, modeling distributed decision-making and learning from interactions with complex environments. Resource Allocation Optimization (RAO) benefits significantly from MARL's ability to tackle dynamic and decentralized contexts. MARL-based approaches are increasingly applied to RAO challenges across sectors that play pivotal roles in Industry 4.0 developments. This survey provides a comprehensive review of recent MARL algorithms for RAO, encompassing core concepts, classifications, and a structured taxonomy. By outlining the current research landscape and identifying primary challenges and future directions, this survey aims to support researchers and practitioners in leveraging MARL's potential to advance resource allocation solutions.