Multi-Agent Reinforcement Learning for Task Offloading in Wireless Edge Networks

📅 2025-09-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing resource contention and communication constraints in task offloading for wireless edge networks, this paper proposes a decentralized multi-agent reinforcement learning (MARL) framework. Methodologically, the problem is formulated as a constrained Markov decision process (CMDP); a dynamically updated shared constraint vector enables lightweight implicit coordination, and decentralized policy optimization is integrated with a sparse constraint-update mechanism, aligning agents with global resource objectives at very low communication overhead. The paper establishes convergence guarantees and improves policy robustness via safe reinforcement learning. Experiments show that the approach significantly outperforms centralized and independent baselines in large-scale scenarios, balancing local decision efficiency with global load balancing. The core contribution is replacing conventional explicit coordination with a scalable, low-overhead constraint-sharing mechanism, enabling efficient and adaptive distributed control in resource-constrained edge environments.
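The constraint-sharing idea can be illustrated with a toy sketch: a shared constraint vector acts as a per-server congestion price that is updated only sparsely, while each agent decides locally. All numbers, the greedy local policy, and the dual update rule below are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

# Toy sketch (assumed setup, not the paper's implementation): each agent picks
# one of K shared servers; a shared constraint vector `lam` prices congestion
# and is refreshed only every T_SYNC steps (the "sparse update" mechanism).

rng = np.random.default_rng(0)

N_AGENTS, N_SERVERS = 8, 3
CAPACITY = np.array([3.0, 3.0, 3.0])   # per-server load budget (assumed)
T_SYNC = 10                            # sparse constraint-update period
ETA = 0.5                              # dual step size

# Fixed local rewards for choosing each server (assumed, for illustration).
local_reward = rng.uniform(0.0, 1.0, size=(N_AGENTS, N_SERVERS))
lam = np.zeros(N_SERVERS)              # shared constraint vector (dual prices)

def choose_servers(lam):
    """Each agent greedily maximizes local reward minus the shared penalty."""
    return np.argmax(local_reward - lam[None, :], axis=1)

for t in range(100):
    load = np.bincount(choose_servers(lam), minlength=N_SERVERS).astype(float)
    if t % T_SYNC == 0:
        # Sparse dual update: raise the price of overloaded servers,
        # relax it (down to zero) where capacity is slack.
        lam = np.maximum(0.0, lam + ETA * (load - CAPACITY))

final_load = np.bincount(choose_servers(lam), minlength=N_SERVERS)
```

Agents never exchange messages directly; the only coordination signal is the infrequently refreshed price vector, which is what keeps communication overhead low.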

📝 Abstract
In edge computing systems, autonomous agents must make fast local decisions while competing for shared resources. Existing MARL methods often resort to centralized critics or frequent communication, which fail under limited observability and communication constraints. We propose a decentralized framework in which each agent solves a constrained Markov decision process (CMDP), coordinating implicitly through a shared constraint vector. In task offloading, for example, these constraints prevent overloading shared server resources. The constraints are updated infrequently and act as a lightweight coordination mechanism: they let agents align with global resource-usage objectives while requiring little direct communication. Using safe reinforcement learning, agents learn policies that meet both local and global goals. We establish theoretical guarantees under mild assumptions and validate our approach experimentally, showing improved performance over centralized and independent baselines, especially in large-scale settings.
Problem

Research questions and friction points this paper is trying to address.

Decentralized multi-agent task offloading in wireless edge networks
Agents compete for shared resources under communication constraints
Achieve global resource coordination with minimal direct communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized CMDP framework for agents
Implicit coordination via shared constraint vectors
Safe reinforcement learning for global objectives
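The second and third bullets can be sketched together from a single agent's point of view: a safe-RL policy update optimizes the Lagrangian of the CMDP, i.e. local reward minus the shared constraint prices, which stay fixed between sparse syncs. The softmax-bandit policy, rewards, and step size below are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical single-agent CMDP step (assumed setup): a softmax policy over
# servers is improved by a REINFORCE-style gradient on the Lagrangian
# r(a) - lam . c(a), with `lam` the shared constraint vector.

rng = np.random.default_rng(1)
N_SERVERS = 3
theta = np.zeros(N_SERVERS)            # policy logits
reward = np.array([1.0, 0.6, 0.4])     # assumed expected reward per server
cost = np.eye(N_SERVERS)               # choosing server k loads server k
lam = np.array([0.8, 0.0, 0.0])        # shared prices: server 0 is congested
ALPHA = 0.2                            # learning rate

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(200):
    pi = softmax(theta)
    a = rng.choice(N_SERVERS, p=pi)
    # Lagrangian return for this action: local reward minus constraint price.
    g = reward[a] - lam @ cost[a]
    # Score-function gradient of log pi(a) w.r.t. the logits.
    grad = -pi
    grad[a] += 1.0
    theta += ALPHA * g * grad

pi = softmax(theta)
```

Because the congestion price penalizes the otherwise-best server 0, the policy is steered toward less loaded servers without the agent ever observing the other agents directly.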