Multi-Agent Reinforcement Learning for Sample-Efficient Deep Neural Network Mapping

📅 2025-07-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Low sample efficiency and poor scalability limit the use of reinforcement learning (RL) for DNN hardware mapping optimization in large, complex design spaces. Method: This paper proposes a decentralized multi-agent RL framework that combines a correlation-driven agent clustering mechanism with a distributed parallel search strategy, broadening exploration while suppressing redundant training. The framework enables end-to-end automated mapping decisions that jointly optimize latency, energy consumption, and resource utilization. Contribution/Results: Experiments show that the proposed method achieves 30–300× higher sample efficiency than single-agent RL baselines. Under identical sample budgets, it reduces latency by up to 32.61× and improves the energy-delay product (EDP) by up to 16.45×, overcoming a key efficiency bottleneck in high-performance accelerator design.

📝 Abstract
Mapping deep neural networks (DNNs) to hardware is critical for optimizing latency, energy consumption, and resource utilization, making it a cornerstone of high-performance accelerator design. Due to the vast and complex mapping space, reinforcement learning (RL) has emerged as a promising approach, but its effectiveness is often limited by sample inefficiency. We present a decentralized multi-agent reinforcement learning (MARL) framework designed to overcome this challenge. By distributing the search across multiple agents, our framework accelerates exploration. To avoid the inefficiency of training multiple agents in parallel, we introduce an agent clustering algorithm that assigns correlated mapping parameters to the same agent based on correlation analysis. This enables a decentralized, parallelized learning process that significantly improves sample efficiency. Experimental results show our MARL approach improves sample efficiency by 30–300× over standard single-agent RL, achieving up to 32.61× latency reduction and 16.45× energy-delay product (EDP) reduction under iso-sample conditions.
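The correlation-driven clustering step described in the abstract could be sketched roughly as follows. The paper does not publish code, so the threshold, the greedy grouping rule, and all names here are illustrative assumptions, not the authors' algorithm:

```python
# Illustrative sketch (not the paper's code): group mapping parameters
# whose sampled values are strongly correlated, so each group can be
# assigned to one agent.
import numpy as np

def cluster_parameters(samples: np.ndarray, threshold: float = 0.5):
    """Greedily group parameter indices whose pairwise |correlation|
    across sampled configurations exceeds `threshold`.

    samples: (n_samples, n_params) matrix of sampled mapping parameters.
    Returns a list of clusters (lists of parameter indices); each cluster
    would be handled by a single agent.
    """
    corr = np.corrcoef(samples, rowvar=False)  # (n_params, n_params)
    n = corr.shape[0]
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        cluster = [i]
        assigned[i] = True
        for j in range(i + 1, n):
            if not assigned[j] and abs(corr[i, j]) > threshold:
                cluster.append(j)
                assigned[j] = True
        clusters.append(cluster)
    return clusters

# Toy usage: parameters 0 and 1 are strongly correlated, parameter 2 is
# independent, so the expected grouping is [[0, 1], [2]].
rng = np.random.default_rng(0)
x = rng.normal(size=200)
samples = np.column_stack([x,
                           x + 0.01 * rng.normal(size=200),
                           rng.normal(size=200)])
print(cluster_parameters(samples))  # → [[0, 1], [2]]
```

The greedy single-pass grouping is the simplest choice that demonstrates the idea; an actual implementation might use hierarchical or graph-based clustering over the same correlation matrix.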
Problem

Research questions and friction points this paper is trying to address.

Optimizing DNN hardware mapping for latency and energy efficiency
Addressing sample inefficiency in reinforcement learning approaches
Enabling decentralized parallel learning via agent clustering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized MARL framework for DNN mapping
Agent clustering algorithm for parameter assignment
Parallelized learning boosts sample efficiency significantly
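As a rough intuition for the sample-efficiency claim in the bullets above — not the paper's actual method — decomposing a near-separable mapping space across agents turns a product-sized search into a sum-sized one. The toy cost model, parameter names, and the assumption of separability (which the correlation-driven clustering is meant to justify) are all illustrative:

```python
# Illustrative comparison: joint (single-agent) search enumerates the
# product space; per-cluster agents each search only their own sub-space.
import itertools

TILES = [8, 16, 32, 64]      # hypothetical mapping parameter: tile size
UNROLLS = [1, 2, 4, 8]       # hypothetical mapping parameter: unroll factor

def latency(tile, unroll):   # toy stand-in for a hardware cost model
    return abs(tile - 32) + abs(unroll - 4)

# Joint search: |TILES| * |UNROLLS| = 16 evaluations.
joint = min(itertools.product(TILES, UNROLLS), key=lambda p: latency(*p))

# Decentralized search: one agent per parameter cluster,
# |TILES| + |UNROLLS| = 8 evaluations under the separability assumption.
tile = min(TILES, key=lambda t: latency(t, UNROLLS[0]))
unroll = min(UNROLLS, key=lambda u: latency(tile, u))

print(joint, (tile, unroll))  # → (32, 4) (32, 4)
```

Both searches reach the same optimum here, but the decentralized one uses half the evaluations; the gap grows multiplicatively with the number of parameter clusters, which is consistent in spirit with the 30–300× sample-efficiency gains the paper reports.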