Catwalk: Unary Top-K for Efficient Ramp-No-Leak Neuron Design for Temporal Neural Networks

📅 2025-08-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing CMOS implementations of SRM neurons (e.g., SRM0-RNL) assume that every dendritic input can carry a spike, ignoring the sparsity of real spike volleys, in which only a small subset of inputs carries spikes in a given compute cycle. This leaves hardware resources underused. This work proposes Catwalk, a neuron implementation that exploits this sparsity. Catwalk uses a unary top-k stage to relocate the spikes in a volley into a sorted subset cluster, so that the subsequent parallel counter (PC), which accumulates the RNL response functions of the spiking inputs, becomes significantly cheaper. Place-and-route results show that Catwalk is 1.39x better in area and 1.86x better in power than existing SRM0-RNL neurons.

📝 Abstract

Temporal neural networks (TNNs) are neuromorphic neural networks that utilize bit-serial temporal coding. TNNs are composed of columns, which in turn employ neurons as their building blocks. Each neuron processes volleys of input spikes, modulated by associated synaptic weights, on its dendritic inputs. A recently proposed CMOS neuron implementation employs a Spike Response Model (SRM) with a ramp-no-leak (RNL) response function and assumes that all inputs can carry spikes. However, in actual spike volleys, only a small subset of the dendritic inputs carries spikes in each compute cycle. This form of sparsity can be exploited to achieve better hardware efficiency. In this paper, we propose Catwalk, a neuron implementation that relocates the spikes in a spike volley into a sorted subset cluster via unary top-k. Such relocation can significantly reduce the cost of the subsequent parallel counter (PC) that accumulates the response functions from the spiking inputs, which can improve the area and power efficiency of RNL neuron implementations. Place-and-route results show that Catwalk is 1.39x and 1.86x better in area and power, respectively, than existing SRM0-RNL neurons.
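The mechanism described in the abstract can be illustrated with a small behavioral sketch in Python. It is not the authors' circuit. It assumes a common reading of the RNL response (each spiking input adds one unit per cycle for as many cycles as its weight, then holds), assumes that the top-k relocation is applied to the per-cycle response bits, and uses a simple OR/AND compare-exchange network as a stand-in for the unary top-k stage. The names `rnl_pulses`, `top_k_relocate`, and `fire_time`, and the example volley, are hypothetical.

```python
from typing import List, Optional


def rnl_pulses(spike_times: List[Optional[int]], weights: List[int], t: int) -> List[int]:
    """Per-input response increment at cycle t (assumed RNL model).

    Input i contributes 1 for weights[i] cycles starting at its spike time;
    an input with spike time None never spikes.
    """
    return [
        1 if s is not None and s <= t < s + w else 0
        for s, w in zip(spike_times, weights)
    ]


def top_k_relocate(bits: List[int], k: int) -> List[int]:
    """Move up to k ones onto the first k lines.

    Uses only compare-exchange steps on single bits (OR gives the max,
    AND gives the min). This partial bubble network is an illustrative
    stand-in, not necessarily the topology used by Catwalk.
    """
    lines = list(bits)
    n = len(lines)
    for i in range(k):
        for j in range(n - 1, i, -1):
            hi = lines[j - 1] | lines[j]
            lo = lines[j - 1] & lines[j]
            lines[j - 1], lines[j] = hi, lo
    return lines[:k]


def fire_time(spike_times, weights, threshold, k=None, horizon=16):
    """First cycle at which the body potential reaches the threshold, or None."""
    potential = 0
    for t in range(horizon):
        pulses = rnl_pulses(spike_times, weights, t)
        if k is not None:
            pulses = top_k_relocate(pulses, k)  # counter now sees only k lines
        potential += sum(pulses)  # parallel counter followed by an accumulator
        if potential >= threshold:
            return t
    return None


# Hypothetical sparse volley: 8 dendritic inputs, only 3 of them spike.
spikes = [None, 2, None, None, 0, None, None, 1]
weights = [3, 2, 1, 3, 3, 2, 1, 2]
print(fire_time(spikes, weights, threshold=5))        # counter over all 8 lines
print(fire_time(spikes, weights, threshold=5, k=3))   # counter over 3 relocated lines
```

In this sketch the two calls agree as long as no more than k inputs are active in any cycle; if more than k were active, the relocated version would undercount.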

Problem

Research questions and friction points this paper is trying to address.

Optimizing neuron design for efficient temporal neural networks

Exploiting input sparsity in spike volleys for hardware improvements

Reducing area and power costs in RNL neuron implementation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unary top-k sorting for spike relocation

Reduced parallel counter cost for efficiency

Improved area and power in neuron design
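To give a rough sense of why a parallel counter with fewer inputs is cheaper, the sketch below counts the adders in a generic column-compression popcount tree. This is a textbook construction chosen for illustration, not necessarily the PC used in the paper, and it does not model the cost of the top-k stage itself. The function name `parallel_counter_cost` and the input sizes in the example are hypothetical.

```python
def parallel_counter_cost(n: int) -> tuple:
    """Return (full_adders, half_adders) needed to count the ones among n bits.

    Each column is reduced to a single output bit: a full adder turns three
    bits into one sum bit plus one carry, a half adder turns two bits into
    one sum bit plus one carry. Carries feed the next, more significant column.
    """
    full = half = 0
    bits_in_column = n
    while bits_in_column > 1:
        carries = 0
        while bits_in_column >= 3:
            bits_in_column -= 2  # three bits in, one sum bit stays in the column
            carries += 1
            full += 1
        if bits_in_column == 2:
            bits_in_column -= 1  # two bits in, one sum bit stays in the column
            carries += 1
            half += 1
        bits_in_column = carries  # move on to the next column
    return full, half


# Hypothetical sizes: 15 dendritic inputs versus a relocated cluster of 7 lines.
print(parallel_counter_cost(15))
print(parallel_counter_cost(7))
```

Under these assumptions, a counter over 15 lines needs 11 full adders while one over 7 lines needs 4, which is the kind of saving that relocating spikes into a smaller cluster is meant to unlock.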
