🤖 AI Summary
Existing SNN accelerators exploit only zero-value sparsity, overlooking the structured pattern distribution inherent in binary spike activations, which limits computational efficiency. This work identifies and models such patterned distributions in spike activations, proposing Phi: a pattern-driven two-level sparse computing framework. At the vector level, Phi employs k-means clustering to extract predefined activation patterns and enables offline weight reuse; at the element level, it deploys complementary high-sparsity matrices to support fine-grained computation skipping. Through pattern-aware fine-tuning and algorithm-hardware co-design, Phi realizes a dedicated accelerator architecture. Experiments demonstrate that Phi achieves a 3.45× speedup and a 4.93× energy efficiency improvement over state-of-the-art SNN accelerators, breaking the performance ceiling imposed by conventional zero-value sparsity paradigms.
📝 Abstract
Spiking Neural Networks (SNNs) are gaining attention for their energy efficiency and biological plausibility, utilizing 0-1 activation sparsity through spike-driven computation. While existing SNN accelerators exploit this sparsity to skip zero computations, they often overlook the unique distribution patterns inherent in binary activations. In this work, we observe that particular patterns exist in spike activations, which we can utilize to reduce the substantial computation of SNN models. Based on these findings, we propose a novel **pattern-based hierarchical sparsity** framework, termed ***Phi***, to optimize computation. *Phi* introduces a two-level sparsity hierarchy: Level 1 exhibits vector-wise sparsity by representing activations with pre-defined patterns, allowing for offline pre-computation with weights and eliminating most runtime computation. Level 2 features element-wise sparsity by complementing the Level 1 matrix with a highly sparse matrix, further reducing computation while maintaining accuracy. We present an algorithm-hardware co-design approach. Algorithmically, we employ a k-means-based pattern selection method to identify representative patterns and introduce a pattern-aware fine-tuning technique to enhance Level 2 sparsity. Architecturally, we design ***Phi***, a dedicated hardware architecture that efficiently processes the two levels of *Phi* sparsity on the fly. Extensive experiments demonstrate that *Phi* achieves a 3.45× speedup and a 4.93× improvement in energy efficiency compared to state-of-the-art SNN accelerators, showcasing the effectiveness of our framework in optimizing SNN computation.
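The two-level decomposition described in the abstract can be sketched numerically: each binary spike vector is approximated by its nearest pre-defined pattern (Level 1, whose product with the weights is pre-computed offline), and a highly sparse signed residual (Level 2) corrects the result exactly. The sketch below is a minimal NumPy illustration under assumed shapes and a plain k-means pattern selection; all names are hypothetical and the paper's actual algorithm and fine-tuning step may differ in detail.

```python
# Hypothetical sketch of a pattern-based two-level sparse matmul
# (illustrative only; not the paper's actual implementation).
import numpy as np

rng = np.random.default_rng(0)

# Binary spike activation matrix S (rows = spike vectors) and weights W.
N, D, M, K = 64, 16, 8, 4           # vectors, vector dim, output dim, patterns
S = (rng.random((N, D)) < 0.3).astype(np.int8)
W = rng.standard_normal((D, M))

# Level 1: pick K representative binary patterns via a simple k-means on the
# spike vectors, rounding centroids back to 0/1 at the end.
P = S[rng.choice(N, K, replace=False)].astype(float)
for _ in range(10):
    assign = np.argmin(((S[:, None, :] - P[None]) ** 2).sum(-1), axis=1)
    for k in range(K):
        members = S[assign == k]
        if len(members):
            P[k] = members.mean(0)
P = (P >= 0.5).astype(np.int8)

# Offline: pre-compute each pattern's product with the weights once;
# at runtime a pattern match costs only a table lookup.
pattern_products = P @ W            # shape (K, M), reused for every vector

# Level 2: the complementary residual R = S - nearest pattern is highly
# sparse, with entries in {-1, 0, +1} that correct the pre-computed term.
nearest = np.argmin(((S[:, None, :] - P[None]) ** 2).sum(-1), axis=1)
R = S - P[nearest]                  # mostly zeros -> fine-grained skipping

out = pattern_products[nearest] + R @ W   # two-level result
assert np.allclose(out, S @ W)            # exact: pattern term + sparse fix-up
print("residual sparsity:", 1 - np.count_nonzero(R) / R.size)
```

The key property is that the decomposition is exact (S = pattern + residual), so accuracy loss comes only from how sparse fine-tuning makes the residual, not from the reconstruction itself.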