RNM-TD3: N:M Semi-structured Sparse Reinforcement Learning From Scratch

📅 2026-02-16
🤖 AI Summary
This work addresses the challenge of reconciling hardware efficiency with performance preservation in sparse deep reinforcement learning. While unstructured sparsity lacks hardware support and structured sparsity often degrades performance, this study is the first to introduce row-wise N:M semi-structured sparsity into the off-policy TD3 algorithm, proposing an end-to-end trainable, hardware-aware sparse RL framework. By enforcing N:M sparsity constraints throughout training—without requiring post-processing—the method enables efficient learning even at high sparsity levels. Experiments on continuous control tasks demonstrate up to a 14% performance gain under 2:4 sparsity on the Ant environment, while maintaining competitive performance even at 87.5% sparsity (1:8). The approach is natively compatible with emerging hardware that supports N:M sparse computation, offering simultaneous benefits in model compression, performance improvement, and training acceleration.

📝 Abstract
Sparsity is a well-studied technique for compressing deep neural networks (DNNs) without compromising performance. In deep reinforcement learning (DRL), neural networks with up to 5% of their original weights can still be trained with minimal performance loss compared to their dense counterparts. However, most existing methods rely on unstructured fine-grained sparsity, which limits hardware acceleration opportunities due to irregular computation patterns. Structured coarse-grained sparsity enables hardware acceleration, yet typically degrades performance and increases pruning complexity. In this work, we present, to the best of our knowledge, the first study on N:M structured sparsity in RL, which balances compression, performance, and hardware efficiency. Our framework enforces row-wise N:M sparsity throughout training for all networks in off-policy RL (TD3), maintaining compatibility with accelerators that support N:M sparse matrix operations. Experiments on continuous-control benchmarks show that RNM-TD3, our N:M sparse agent, outperforms its dense counterpart at 50%-75% sparsity (e.g., 2:4 and 1:4), achieving up to a 14% increase in performance at 2:4 sparsity on the Ant environment. RNM-TD3 remains competitive even at 87.5% sparsity (1:8), while enabling potential training speedups.
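To make the row-wise N:M constraint concrete, the following sketch builds a magnitude-based N:M mask for a weight matrix: within each group of M consecutive weights along a row, the N largest-magnitude entries are kept and the rest are zeroed (so 2:4 gives 50% sparsity, 1:8 gives 87.5%). This is a minimal NumPy illustration of the general N:M masking pattern, not the paper's exact training procedure; the function name and shapes are assumptions for the example.

```python
import numpy as np

def row_nm_mask(weights, n=2, m=4):
    """Row-wise N:M mask: in every group of M consecutive weights
    along a row, keep the N largest magnitudes, zero the rest."""
    rows, cols = weights.shape
    assert cols % m == 0, "row length must be divisible by M"
    groups = np.abs(weights).reshape(rows, cols // m, m)
    # indices of the N largest-magnitude entries in each group of M
    keep = np.argsort(groups, axis=-1)[..., -n:]
    mask = np.zeros_like(groups, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=-1)
    return mask.reshape(rows, cols)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))          # toy layer weights
mask = row_nm_mask(W, n=2, m=4)      # 2:4 -> 50% sparsity
sparse_W = W * mask                  # each group of 4 keeps exactly 2 nonzeros
```

Enforcing such a mask at every training step (rather than pruning after training) is what keeps the weight layout compatible with accelerators that execute N:M sparse matrix operations natively.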
Problem

Research questions and friction points this paper is trying to address.

structured sparsity
deep reinforcement learning
hardware acceleration
model compression
N:M sparsity
Innovation

Methods, ideas, or system contributions that make the work stand out.

N:M sparsity
structured sparse reinforcement learning
hardware-efficient RL
RNM-TD3
sparse neural networks
Isam Vrce
Deggendorf Institute of Technology, Dieter-Görlitz-Platz 1, Deggendorf, Germany
Andreas Kassler
Karlstad University, Deggendorf Institute of Technology
Gökçe Aydos
Department of Engineering Technology, Technical University of Denmark, Ballerup, Denmark