Multi-Agent Actor-Critic with Harmonic Annealing Pruning for Dynamic Spectrum Access Systems

📅 2025-03-19

🤖 AI Summary

To address the challenge of deploying multi-agent deep reinforcement learning (MADRL) for dynamic spectrum access (DSA) on resource-constrained edge devices, this paper proposes a progressive structured pruning method tailored to the independent-actor, global-critic recurrent MARL framework. The key contributions are: (1) a harmonic annealing sparsity scheduler that performs comparably to, and in some cases better than, standard linear and polynomial schedulers at large sparsities; and (2) the integration of progressive pruning into the recurrent MARL training pipeline, so that model compactness and policy performance are optimized jointly. Experiments show the method outperforming conventional DSA approaches, MADRL baselines, and state-of-the-art pruning techniques under diverse training conditions. Reportedly, at sparsity levels above 90%, policy performance improves rather than degrades, which would enable decentralized, low-overhead, real-time spectrum decision-making on edge hardware.

📝 Abstract

Multi-Agent Deep Reinforcement Learning (MADRL) has emerged as a powerful tool for optimizing decentralized decision-making systems in complex settings, such as Dynamic Spectrum Access (DSA). However, deploying deep learning models on resource-constrained edge devices remains challenging due to their high computational cost. To address this challenge, we present a novel sparse recurrent MARL framework that integrates gradual neural network pruning into the independent-actor, global-critic paradigm. Additionally, we introduce a harmonic annealing sparsity scheduler, which achieves comparable, and in certain cases superior, performance to standard linear and polynomial pruning schedulers at large sparsities. Our experimental investigation demonstrates that the proposed DSA framework can discover superior policies under diverse training conditions, outperforming conventional DSA, MADRL baselines, and state-of-the-art pruning techniques.
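For context, the linear and polynomial schedulers that the abstract uses as baselines can be sketched as below. This is only an illustration: the function names and parameters are hypothetical, the polynomial form is the common cubic gradual-pruning schedule, and the paper's harmonic annealing scheduler is not reproduced because its formula is not given here.

```python
def _progress(step, total_steps):
    """Fraction of the pruning phase completed, clamped to [0, 1]."""
    return min(max(step / total_steps, 0.0), 1.0)


def linear_sparsity(step, total_steps, final_sparsity, initial_sparsity=0.0):
    """Sparsity target that rises at a constant rate (hypothetical helper)."""
    frac = _progress(step, total_steps)
    return initial_sparsity + (final_sparsity - initial_sparsity) * frac


def polynomial_sparsity(step, total_steps, final_sparsity,
                        initial_sparsity=0.0, power=3):
    """Sparsity target that rises quickly at first and flattens near the end.

    With power=3 this is the widely used cubic gradual-pruning schedule.
    """
    frac = _progress(step, total_steps)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - frac) ** power


if __name__ == "__main__":
    # Compare both schedules on the way to an assumed 90% final sparsity.
    for step in (0, 25, 50, 75, 100):
        lin = linear_sparsity(step, 100, 0.9)
        poly = polynomial_sparsity(step, 100, 0.9)
        print(f"step={step:3d}  linear={lin:.3f}  polynomial={poly:.3f}")
```

Both schedules start and end at the same sparsity; they differ only in how aggressively they prune early in training, which is the design axis a new scheduler would vary.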

Problem

Research questions and friction points this paper is trying to address.

Optimizing decentralized decision-making in Dynamic Spectrum Access systems.

Reducing computational cost for deep learning on edge devices.

Maintaining policy performance at large sparsities through the choice of sparsity scheduler.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse recurrent MARL framework for DSA

Harmonic annealing sparsity scheduler introduced

Gradual neural network pruning integrated
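As a rough illustration of what gradual pruning means in practice, the sketch below raises the sparsity of a flat weight vector in stages, following whatever schedule it is given. It is a minimal sketch under stated assumptions: it uses unstructured magnitude pruning as a stand-in, the names `magnitude_mask`, `gradual_prune`, and `train_step` are hypothetical, and it does not represent the paper's actual pruning criterion or its actor-critic training loop.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude fraction of weights."""
    n_prune = int(round(sparsity * len(weights)))
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


def gradual_prune(weights, schedule, total_steps, prune_every, train_step=None):
    """Interleave (optional) training updates with staged magnitude pruning.

    schedule(step) gives the target sparsity in [0, 1] at a given step.
    train_step, if provided, is a placeholder for one optimizer update.
    """
    weights = list(weights)
    mask = [1] * len(weights)
    for step in range(1, total_steps + 1):
        if train_step is not None:
            weights = train_step(weights)
        # Re-apply the mask so already-pruned weights stay at zero.
        weights = [w * m for w, m in zip(weights, mask)]
        if step % prune_every == 0:
            mask = magnitude_mask(weights, schedule(step))
            weights = [w * m for w, m in zip(weights, mask)]
    return weights, mask


if __name__ == "__main__":
    w = [0.5, -0.1, 0.3, -0.8, 0.05, 0.9, -0.2, 0.4, 0.7, -0.6]
    # Assumed linear ramp to 80% sparsity over 10 steps, pruning every 5 steps.
    pruned, mask = gradual_prune(w, lambda s: 0.8 * s / 10, 10, 5)
    print(mask)
    print(pruned)
```

Because pruned weights are held at zero, they always rank lowest in later rounds, so each stage only adds to the set of removed weights as long as the schedule is non-decreasing.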
