Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control

📅 2025-05-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Spiking neural networks (SNNs) applied to continuous-control reinforcement learning (RL) suffer training instability because their discrete-time dynamics clash with the soft-update mechanism of target networks. To address this, the paper proposes the Proxy Target Framework: a continuously differentiable auxiliary target network replaces the conventional non-differentiable SNN target network, enabling smooth parameter updates. By integrating gradient-approximation techniques with standard RL algorithms (e.g., SAC or TD3), the framework supports end-to-end training without additional inference overhead. Crucially, it is the first method to let SNNs built from simple leaky integrate-and-fire (LIF) neurons outperform comparably sized artificial neural networks (ANNs) across multiple continuous-control benchmarks, with gains of up to 32%. This demonstrates the viability of energy-efficient neuromorphic agents for complex continuous-control tasks.

📝 Abstract
Spiking Neural Networks (SNNs) offer low-latency and energy-efficient decision making through neuromorphic hardware, making them compelling for Reinforcement Learning (RL) on resource-constrained edge devices. Recent studies in this field directly replace Artificial Neural Networks (ANNs) with SNNs in existing RL frameworks, overlooking whether the RL algorithm is suitable for SNNs. However, most RL algorithms for continuous control are tailored to ANNs, including the target network soft-update mechanism, which conflicts with the discrete, non-differentiable dynamics of SNN spikes. We identify that this mismatch destabilizes SNN training in continuous control tasks. To bridge this gap between discrete SNNs and continuous control, we propose a novel proxy target framework. The continuous and differentiable dynamics of the proxy target enable smooth updates, bypassing the incompatibility with SNN spikes and stabilizing the RL algorithms. Since the proxy network operates only during training, the SNN retains its energy efficiency at deployment with no inference overhead. Extensive experiments on continuous control benchmarks demonstrate that, compared to vanilla SNNs, the proxy target framework enables SNNs to achieve up to 32% higher performance across different spiking neuron models. Notably, we are the first to surpass ANN performance in continuous control with simple Leaky-Integrate-and-Fire (LIF) neurons. This work motivates a new class of SNN-friendly RL algorithms tailored to SNNs' characteristics, paving the way for neuromorphic agents that combine high performance with low power consumption.
Problem

Research questions and friction points this paper is trying to address.

Mismatch between discrete SNNs and continuous RL control algorithms
Incompatibility of SNN spikes with target network soft updates
Instability in SNN training for continuous control tasks
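The friction point above centers on the target-network soft update (Polyak averaging) used by algorithms like SAC and TD3. A minimal sketch of that mechanism, in plain Python rather than the paper's implementation: each target parameter drifts toward its online counterpart by a small step tau, a continuous change that binary spiking dynamics cannot track smoothly.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: theta_target <- tau * theta_online + (1 - tau) * theta_target.

    Each call nudges the target parameters infinitesimally toward the online
    network; this assumes parameters behave continuously, which holds for
    ANNs but clashes with the discrete, all-or-nothing output of SNN spikes.
    """
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, online_params)]

# Toy parameters: the target moves only a tiny fraction of the way per update.
target = [0.0, 1.0]
online = [1.0, 0.0]
target = soft_update(target, online, tau=0.5)
# target is now [0.5, 0.5]
```

With the typical tau of around 0.005, thousands of such micro-steps accumulate; the paper's observation is that a spiking target network cannot reflect these micro-steps smoothly, destabilizing training.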
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proxy target bridges SNN and continuous control
Differentiable proxy enables smooth RL updates
Proxy operates only during training, no inference overhead
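The innovations above can be sketched as a toy in plain Python. This is an illustrative assumption of how a proxy target might work, not the paper's actual architecture: the soft-updated target is a smooth sigmoid "proxy" of the spiking network, so Polyak averaging acts on continuous dynamics, while deployment uses only the hard-threshold spiking unit.

```python
import math

def spike_forward(w, x, threshold=1.0):
    """Spiking unit: binary output from a hard threshold (non-differentiable)."""
    return 1.0 if w * x >= threshold else 0.0

def proxy_forward(w, x, beta=4.0):
    """Proxy unit: a smooth sigmoid surrogate sharing the same weight.

    Its output varies continuously with w, so small soft-update steps
    produce small changes in the target value. beta is an illustrative
    sharpness parameter, not taken from the paper.
    """
    return 1.0 / (1.0 + math.exp(-beta * (w * x - 1.0)))

online_w, proxy_w, tau = 1.2, 0.0, 0.1
for _ in range(5):                    # training: proxy tracks the online SNN smoothly
    proxy_w = tau * online_w + (1.0 - tau) * proxy_w

target_value = proxy_forward(proxy_w, 1.0)  # targets come from the continuous proxy
deployed = spike_forward(online_w, 1.0)     # deployment runs only the spiking network
```

The design point the bullets emphasize: the proxy exists only in the training loop (the `proxy_forward` path), so the deployed agent is the bare SNN with no extra inference cost.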