Noisy Spiking Actor Network for Exploration

📅 2024-03-07

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To address the difficulty of exploration with Spiking Neural Networks (SNNs) in deep reinforcement learning, where the binary spiking mechanism makes the network strongly robust to noise, this paper proposes the Noisy Spiking Actor Network (NoisySAN). The method introduces parameterized, time-correlated noise into both the charging (membrane potential accumulation) and spike transmission processes of Leaky Integrate-and-Fire (LIF) neurons, which the summary describes as the first such integration in SNN-based RL. A noise reduction method is further designed to help the agent reach a stable policy, balancing exploration efficiency against convergence stability. Evaluated on multiple continuous-control benchmarks in OpenAI Gym, NoisySAN outperforms existing SNN-based and mainstream deep RL exploration methods, with a reported average performance gain of 12.6% and 37% faster policy convergence.

📝 Abstract

As a general method for exploration in deep reinforcement learning (RL), NoisyNet can produce problem-specific exploration strategies. Spiking neural networks (SNNs), due to their binary firing mechanism, have strong robustness to noise, making it difficult to realize efficient exploration with local disturbances. To solve this exploration problem, we propose a noisy spiking actor network (NoisySAN) that introduces time-correlated noise during charging and transmission. Moreover, a noise reduction method is proposed to find a stable policy for the agent. Extensive experimental results demonstrate that our method outperforms the state of the art on a wide range of continuous control tasks from OpenAI Gym.
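The idea of injecting time-correlated noise during charging and transmission can be illustrated with a minimal NumPy sketch. It is not the paper's formulation: the AR(1) noise process, the hard reset, and every name and default here (`noisy_lif_rollout`, `rho`, `noise_scale`, `decay`, `v_th`) are assumptions chosen for illustration only.

```python
import numpy as np


def noisy_lif_rollout(inputs, noise_scale=0.5, rho=0.9, decay=0.75, v_th=0.5, seed=0):
    """Simulate a layer of LIF neurons with correlated noise (illustrative only).

    inputs: array of shape (T, n) with the input current at each time step.
    Returns a (T, n) array of the noisy values transmitted downstream.
    """
    rng = np.random.default_rng(seed)
    T, n = inputs.shape
    v = np.zeros(n)            # membrane potential
    eps_charge = np.zeros(n)   # noise state used during charging
    eps_trans = np.zeros(n)    # noise state used during transmission
    out = np.zeros((T, n))
    for t in range(T):
        # Assumed AR(1) update: each sample is correlated with the previous one.
        eps_charge = rho * eps_charge + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        eps_trans = rho * eps_trans + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        # Charging: leak, integrate the input, and perturb the membrane potential.
        v = decay * v + inputs[t] + noise_scale * eps_charge
        spikes = (v >= v_th).astype(float)
        v = v * (1.0 - spikes)  # hard reset for neurons that fired
        # Transmission: perturb the value passed to the next layer.
        out[t] = spikes + noise_scale * eps_trans
    return out


if __name__ == "__main__":
    current = np.full((4, 2), 0.3)
    print(noisy_lif_rollout(current, noise_scale=0.0))  # noise-free spike train
    print(noisy_lif_rollout(current, noise_scale=0.5))  # perturbed outputs
```

With `noise_scale=0.0` the layer reduces to a plain deterministic LIF update, which makes it easy to compare the noisy and noise-free behavior on the same input.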

Problem

Research questions and friction points this paper is trying to address.

Achieving efficient exploration with spiking neural networks in deep reinforcement learning

Overcoming the robustness of binary spiking to local disturbances, which weakens noise-based exploration

Reaching a stable policy after exploration noise has been introduced

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces time-correlated noise in spiking networks

Proposes noise reduction for stable policy learning

Outperforms state-of-the-art on continuous control tasks
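The listing does not describe how the noise reduction works. One simple way to reduce exploration noise over training, assumed here purely for illustration and not taken from the paper, is to anneal a noise scale linearly. The function name and its defaults are hypothetical.

```python
def annealed_noise_scale(step, total_steps, start=0.5, end=0.0):
    """Linearly interpolate a noise scale from `start` to `end` (hypothetical schedule)."""
    # Clamp progress to [0, 1] so steps past the horizon keep the final value.
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + frac * (end - start)


if __name__ == "__main__":
    for step in (0, 50, 100, 200):
        print(step, annealed_noise_scale(step, total_steps=100))
```

A schedule like this trades exploration early in training for a less perturbed policy later on. The method in the paper may work quite differently, for example by learning the noise parameters themselves.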
