🤖 AI Summary
To address the challenges of maintaining contextual information in spiking neural networks (SNNs) for long-sequence tasks—while simultaneously meeting stringent hardware energy-efficiency and memory-budget constraints—this paper proposes an algorithm-hardware co-design framework. Inspired by cortical fast-slow memory mechanisms, the core innovation is a dual-path architecture: an explicit slow-memory pathway ensures stable long-term state retention and event-driven sparsity, while a fast pathway handles transient perception. The design integrates low-dimensional state modeling, heterogeneous sparse dataflow scheduling, and near-memory computing hardware to enable efficient algorithm-hardware synergy. Experiments demonstrate state-of-the-art accuracy on long-sequence benchmarks, with 40–60% fewer parameters, 4.1× higher hardware throughput, and 5.3× improved energy efficiency compared to prior approaches.
📝 Abstract
Spiking neural networks excel at event-driven sensing, yet maintaining task-relevant context over long timescales, particularly in hardware with tight energy and memory budgets, remains a core challenge in the field. We address this challenge through a novel algorithm-hardware co-design effort. At the algorithm level, inspired by the brain's cortical fast-slow organization, we introduce a neural network with an explicit slow-memory pathway that, combined with fast spiking activity, yields a dual memory pathway (DMP) architecture in which each layer maintains a compact low-dimensional state that summarizes recent activity and modulates spiking dynamics. This explicit memory stabilizes learning while preserving event-driven sparsity, achieving competitive accuracy on long-sequence benchmarks with 40-60% fewer parameters than equivalent state-of-the-art spiking neural networks. At the hardware level, we introduce a near-memory-compute architecture that fully exploits the DMP architecture by retaining its compact shared state while optimizing dataflow across heterogeneous sparse-spike and dense-memory pathways. Experimental results demonstrate more than a 4x increase in throughput and over a 5x improvement in energy efficiency compared with state-of-the-art implementations. Together, these contributions show that biological principles can guide functional abstractions that are both algorithmically effective and hardware-efficient, establishing a scalable co-design paradigm for real-time neuromorphic computation and learning.
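The abstract describes each layer pairing fast spiking dynamics with a compact low-dimensional slow state that summarizes recent activity and modulates the spiking pathway. The paper's exact equations are not given here, so the following is a minimal illustrative sketch of what such a dual-pathway layer update could look like: a leaky integrate-and-fire fast path plus an exponentially averaged low-dimensional memory fed back as a modulating current. All names (`dmp_layer_step`, `W_mem`, `U`), decay constants, and the modulation mechanism are assumptions, not the authors' formulation.

```python
import numpy as np

def dmp_layer_step(x, v, m, W_in, W_mem, U,
                   alpha=0.9, beta=0.99, threshold=1.0):
    """One timestep of a hypothetical dual-memory-pathway (DMP) layer.

    Fast path: leaky integrate-and-fire membrane `v` driven by input spikes `x`.
    Slow path: low-dimensional state `m` (dim << layer width) that summarizes
    recent spiking activity and feeds back to modulate membrane dynamics.
    Illustrative only; not the paper's exact equations.
    """
    # Slow memory feedback modulates the fast pathway's input current.
    modulation = U @ m                       # project slow state up to layer width
    current = W_in @ x + modulation
    v = alpha * v + current                  # leaky membrane integration
    spikes = (v >= threshold).astype(float)  # event-driven, sparse binary output
    v = v * (1.0 - spikes)                   # reset neurons that fired
    # Slow state: compact exponential summary of recent layer activity.
    m = beta * m + (1.0 - beta) * (W_mem @ spikes)
    return spikes, v, m
```

Because `m` has far fewer dimensions than the layer itself, it is the kind of small, shared state that a near-memory-compute design could keep resident while streaming the sparse spike traffic separately, which is consistent with the heterogeneous sparse-spike / dense-memory dataflow the abstract describes.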