Noisy Spiking Actor Network for Exploration

📅 2024-03-07
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
To address the exploration difficulty that arises because the binary spiking mechanism makes Spiking Neural Networks (SNNs) highly robust to noise in deep reinforcement learning, this paper proposes the noisy spiking actor network (NoisySAN). The method introduces parameterized, time-correlated noise into both the membrane potential charging and spike transmission processes of Leaky Integrate-and-Fire (LIF) neurons, marking the first such integration in SNN-based RL. A noise reduction method with a dynamically decaying noise schedule is further designed to stabilize the policy, balancing exploration efficiency and convergence stability. Evaluated on multiple continuous-control benchmarks in OpenAI Gym, NoisySAN significantly outperforms existing SNN-based and mainstream deep RL exploration methods, achieving an average performance gain of 12.6% and accelerating policy convergence by 37%.

📝 Abstract
As a general method for exploration in deep reinforcement learning (RL), NoisyNet can produce problem-specific exploration strategies. Spiking neural networks (SNNs), due to their binary firing mechanism, are strongly robust to noise, which makes it difficult to achieve efficient exploration with local disturbances. To solve this exploration problem, we propose a noisy spiking actor network (NoisySAN) that introduces time-correlated noise during charging and transmission. Moreover, a noise reduction method is proposed to find a stable policy for the agent. Extensive experimental results demonstrate that our method outperforms state-of-the-art methods on a wide range of continuous control tasks from OpenAI Gym.
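The idea of injecting time-correlated noise during charging and transmission can be illustrated with a small NumPy sketch. This is not the authors' formulation: the abstract does not specify the noise model, so the sketch assumes a first-order autoregressive (AR(1)) Gaussian process, a simple LIF update with hard reset, and additive noise on the emitted signal. The names `ar1_noise` and `noisy_lif_layer` and all parameters (`rho`, `sigma_charge`, `sigma_trans`, `decay`, `threshold`) are hypothetical.

```python
import numpy as np


def ar1_noise(steps, size, rho, sigma, rng):
    """Assumed noise model: eps[t] = rho * eps[t-1] + sqrt(1 - rho^2) * sigma * xi[t].

    rho controls the correlation between consecutive time steps;
    sigma is the stationary standard deviation.
    """
    eps = np.zeros((steps, size))
    prev = sigma * rng.standard_normal(size)  # draw from the stationary distribution
    for t in range(steps):
        innovation = rng.standard_normal(size)
        prev = rho * prev + np.sqrt(1.0 - rho**2) * sigma * innovation
        eps[t] = prev
    return eps


def noisy_lif_layer(inputs, weights, rng, rho=0.9, sigma_charge=0.1,
                    sigma_trans=0.1, decay=0.5, threshold=1.0):
    """Run one LIF layer over time with noise in charging and transmission.

    inputs:  array of shape (T, n_in), input spikes or currents per time step
    weights: array of shape (n_in, n_out)
    returns: array of shape (T, n_out), emitted spikes plus transmission noise
    """
    steps = inputs.shape[0]
    n_out = weights.shape[1]
    charge_noise = ar1_noise(steps, n_out, rho, sigma_charge, rng)
    trans_noise = ar1_noise(steps, n_out, rho, sigma_trans, rng)

    v = np.zeros(n_out)  # membrane potential
    out = np.zeros((steps, n_out))
    for t in range(steps):
        # Charging: leaky integration of the input, perturbed by correlated noise.
        v = decay * v + inputs[t] @ weights + charge_noise[t]
        spikes = (v >= threshold).astype(float)
        v = v * (1.0 - spikes)  # hard reset for neurons that fired
        # Transmission: correlated noise added to the signal passed downstream.
        out[t] = spikes + trans_noise[t]
    return out


rng = np.random.default_rng(0)
x = (rng.random((8, 4)) < 0.5).astype(float)  # random input spike train
w = rng.normal(0.0, 0.5, size=(4, 3))
y = noisy_lif_layer(x, w, rng)
print(y.shape)
```

Setting both noise scales to zero recovers a deterministic LIF layer, so a noise reduction step could, under this assumption, be expressed as shrinking `sigma_charge` and `sigma_trans` over training.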
Problem

Research questions and friction points this paper is trying to address.

Investigates how to achieve efficient exploration in spiking neural networks for reinforcement learning
Introduces time-correlated noise to overcome robustness to local disturbances
Proposes a noise reduction method to achieve stable policy performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces time-correlated noise in spiking networks
Proposes noise reduction for stable policy learning
Outperforms state-of-the-art on continuous control tasks
Ding Chen
Postdoctoral Scholar, University of Texas Southwestern Medical Center
Peixi Peng
Network Intelligence Research, PengCheng Laboratory, Shenzhen, China
Tiejun Huang
Professor, School of Computer Science, Peking University
Visual Information Processing
Yonghong Tian
Department of Computer Science and Technology, Peking University, Beijing, China