🤖 AI Summary
To address the low inference speed and poor energy efficiency of real-time video denoising on edge devices built on Compute-in-Memory (CIM) architectures, this paper proposes a hardware-algorithm co-design framework. It introduces CIM-NET, presented as the first CIM-aware deep neural network, and CIM-CONV, a novel pseudo-convolutional operator that combines sliding-window decomposition with fully connected transformations to exploit the massive parallelism of matrix-vector multiplication (MVM) in CIM crossbar arrays. With a stride of 8, the method reduces MVM operations to 1/77 of FastDVDnet's (a 98.7% reduction) at a marginal PSNR cost of 0.45 dB (achieving 35.11 dB), while significantly improving inference speed and energy efficiency. This work is presented as the first to systematically resolve the architectural mismatch between DNN models and CIM hardware constraints, delivering a lightweight, efficient, and deployable solution for CIM-accelerated edge video processing.
📝 Abstract
While deep neural network (DNN)-based video denoising has demonstrated significant performance, deploying state-of-the-art models on edge devices remains challenging due to stringent real-time and energy efficiency requirements. Computing-in-Memory (CIM) chips offer a promising solution by integrating computation within memory cells, enabling rapid matrix-vector multiplication (MVM). However, existing DNN models are often designed without considering CIM architectural constraints, limiting their acceleration potential during inference. To address this, we propose a hardware-algorithm co-design framework incorporating two innovations: (1) a CIM-aware architecture, CIM-NET, optimized for large receptive field operations and CIM's crossbar-based MVM acceleration; and (2) a pseudo-convolutional operator, CIM-CONV, used within CIM-NET to integrate slide-based processing with fully connected transformations for high-quality feature extraction and reconstruction. This framework significantly reduces the number of MVM operations, improving inference speed on CIM chips while maintaining competitive performance. Experimental results indicate that, compared to the conventional lightweight model FastDVDnet, CIM-NET substantially reduces MVM operations with only a slight decrease in denoising performance. With a stride value of 8, CIM-NET reduces MVM operations to 1/77th of the original while maintaining competitive PSNR (35.11 dB vs. 35.56 dB).
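To make the MVM-counting argument concrete, the sketch below models a CIM-CONV-style operator under a simplifying assumption (not taken from the paper): each sliding window is flattened and mapped to a single fully connected transform, i.e., one crossbar MVM per window position. The function name `cim_conv`, the patch size, and the output dimension are illustrative choices; the point is only that the MVM count shrinks roughly with the square of the stride, which is the mechanism behind the reported reduction.

```python
import numpy as np

def cim_conv(x, weight, patch=8, stride=8):
    """Hypothetical sketch of a CIM-CONV-style operator.

    Slides a patch x patch window over a single-channel input and applies
    one fully connected transform (one matrix-vector multiplication, MVM,
    i.e., one crossbar pass) per window position.

    x:      (H, W) input array
    weight: (out_dim, patch*patch) fully connected weights
    Returns (rows, cols, out_dim) features and the total MVM count.
    """
    H, W = x.shape
    rows = (H - patch) // stride + 1
    cols = (W - patch) // stride + 1
    out = np.empty((rows, cols, weight.shape[0]))
    mvm_count = 0
    for i in range(rows):
        for j in range(cols):
            win = x[i * stride:i * stride + patch,
                    j * stride:j * stride + patch]
            out[i, j] = weight @ win.reshape(-1)  # one MVM on the crossbar
            mvm_count += 1
    return out, mvm_count

# On a 64x64 input with an 8x8 window, stride=8 needs 8*8 = 64 MVMs,
# while stride=1 needs 57*57 = 3249 MVMs for the same window size.
x = np.random.rand(64, 64)
w = np.random.rand(16, 64)
_, n_stride8 = cim_conv(x, w, patch=8, stride=8)
_, n_stride1 = cim_conv(x, w, patch=8, stride=1)
```

The exact 1/77 figure reported for CIM-NET depends on the full network architecture, not on a single layer; this toy layer only shows why a larger stride with fully connected per-window transforms cuts the MVM count so sharply.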