CADC: Crossbar-Aware Dendritic Convolution for Efficient In-memory Computing

📅 2025-11-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
In crossbar-based in-memory computing (IMC) accelerators for CNNs, large convolutional layers are typically partitioned, generating excessive partial sums (psums), which incur substantial overheads in buffering, data transfer, and accumulation, and exacerbate ADC noise accumulation. To address this, we propose Crossbar-Aware Dendritic Convolution (CADC), the first IMC architecture to incorporate neuroscientifically inspired dendritic nonlinear computation. CADC natively embeds sparse activation during computation, enabling dynamic psum compression and skipping. Integrated with SRAM-based IMC, zero-value compression/skipping, and low-precision quantization, CADC achieves 54%–88% psum reduction across multiple models and datasets. It reduces buffering and data-transfer overhead by 29.3%, accumulation overhead by 47.9%, and improves energy efficiency by up to 22.9×, while incurring negligible accuracy loss.

📝 Abstract
Convolutional neural networks (CNNs) are computationally intensive and often accelerated using crossbar-based in-memory computing (IMC) architectures. However, large convolutional layers must be partitioned across multiple crossbars, generating numerous partial sums (psums) that require additional buffering, transfer, and accumulation, thus introducing significant system-level overhead. Inspired by dendritic computing principles from neuroscience, we propose crossbar-aware dendritic convolution (CADC), a novel approach that dramatically increases sparsity in psums by embedding a nonlinear dendritic function (zeroing negative values) directly within crossbar computations. Experimental results demonstrate that CADC significantly reduces psums, eliminating 80% in LeNet-5 on MNIST, 54% in ResNet-18 on CIFAR-10, 66% in VGG-16 on CIFAR-100, and up to 88% in spiking neural networks (SNN) on the DVS Gesture dataset. The induced sparsity from CADC provides two key benefits: (1) enabling zero-compression and zero-skipping, thus reducing buffer and transfer overhead by 29.3% and accumulation overhead by 47.9%; (2) minimizing ADC quantization noise accumulation, resulting in small accuracy degradation: only 0.01% for LeNet-5, 0.1% for ResNet-18, 0.5% for VGG-16, and 0.9% for SNN. Compared to vanilla convolution (vConv), CADC exhibits accuracy changes ranging from +0.11% to +0.19% for LeNet-5, -0.04% to -0.27% for ResNet-18, +0.99% to +1.60% for VGG-16, and -0.57% to +1.32% for SNN, across crossbar sizes from 64x64 to 256x256. Ultimately, an SRAM-based IMC implementation of CADC achieves 2.15 TOPS and 40.8 TOPS/W for ResNet-18 (4/2/4b), realizing an 11x-18x speedup and 1.9x-22.9x improvement in energy efficiency compared to existing IMC accelerators.
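The core mechanism the abstract describes can be illustrated with a small numerical sketch. All sizes, the RNG seed, and the tile layout below are illustrative assumptions, not the paper's configuration: a long dot product is split across crossbar tiles, each tile emits one psum, and zeroing negative psums in place (the dendritic nonlinearity) creates the sparsity that later stages exploit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 512-long dot product split across 64-row crossbar tiles.
VEC_LEN, XBAR_ROWS = 512, 64
n_tiles = VEC_LEN // XBAR_ROWS

x = rng.standard_normal(VEC_LEN)  # input activations
w = rng.standard_normal(VEC_LEN)  # one filter's weights

# Each tile computes a partial sum (psum) over its 64-row slice.
psums = np.array([
    x[i * XBAR_ROWS:(i + 1) * XBAR_ROWS] @ w[i * XBAR_ROWS:(i + 1) * XBAR_ROWS]
    for i in range(n_tiles)
])

# Vanilla convolution (vConv): accumulate every psum as-is.
vconv_out = psums.sum()

# CADC-style dendritic nonlinearity: zero negative psums inside the crossbar,
# so downstream buffering, transfer, and accumulation can skip them.
dendritic_psums = np.maximum(psums, 0.0)
cadc_out = dendritic_psums.sum()

nonzero = dendritic_psums != 0
print(f"psums kept: {nonzero.sum()}/{n_tiles}")  # skipped psums need no transfer
```

Note that CADC is not numerically equivalent to vConv (zeroing negatives changes the layer's function), which is why the abstract reports small accuracy shifts rather than zero change.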
Problem

Research questions and friction points this paper is trying to address.

Partitioning large convolutional layers across crossbars generates excessive partial sums (psums).
Psums impose buffering, data-transfer, and accumulation overhead, and accumulate ADC quantization noise.
Existing IMC accelerators lack a mechanism that induces psum sparsity during computation itself.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Embedding nonlinear dendritic function within crossbar computations
Increasing sparsity in partial sums to reduce overhead
Enabling zero-compression and zero-skipping for efficiency
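The zero-compression and zero-skipping bullets can be sketched as a bitmap encoding of a sparse psum stream. This is an illustrative encoding, not the paper's exact hardware format: a presence bitmap plus the nonzero values, with an accumulator that never touches the skipped zeros.

```python
def compress(psums):
    """Zero-compression: store a presence bitmap plus only the nonzero values."""
    bitmap = [p != 0 for p in psums]
    values = [p for p in psums if p != 0]
    return bitmap, values

def accumulate(bitmap, values):
    """Zero-skipping accumulation: only nonzero psums reach the adder."""
    total, it = 0, iter(values)
    for present in bitmap:
        if present:  # zero entries are skipped entirely
            total += next(it)
    return total

psums = [3, 0, 0, 5, 0, 2, 0, 0]
bitmap, values = compress(psums)
print(accumulate(bitmap, values))  # → 10
```

With the 54%–88% psum sparsity CADC induces, most bitmap entries are zero, which is what shrinks the buffer, transfer, and accumulation costs cited above.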
Shuai Dong
Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
Junyi Yang
Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
Ye Ke
Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
Hongyang Shang
Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
Arindam Basu
Professor, City University of Hong Kong (past Associate Professor of EEE at NTU)
Neuromorphic · Analog IC · Neuromorphic Engineering · Computing-In-Memory · Brain-machine interface