🤖 AI Summary
To address the high power consumption and low integration density bottlenecks in deploying deep learning models on edge devices, this work proposes the first memristor-native hardware implementation of a fully pipelined MobileNetV3 architecture—including convolution, batch normalization, activation, global average pooling, and fully connected layers. Our approach integrates custom memristor array circuit design, analog-domain neural network mapping, and end-to-end training-inference co-optimization on CIFAR-10, jointly improving accuracy and energy efficiency. The prototype achieves 90.2% classification accuracy on CIFAR-10, with 3.2× lower inference latency and 87% lower energy consumption than digital implementations at equivalent accuracy. Crucially, this is the first systematic demonstration of memristor-based hardware acceleration across *all* computational modules of a lightweight CNN, validating both the feasibility of such designs and their superiority over conventional digital accelerators.
📝 Abstract
The increasing computational demands of deep learning models pose significant challenges for edge devices. To address this, we propose a memristor-based circuit design for MobileNetV3, targeting image classification tasks. Our design leverages the low power consumption and high integration density of memristors, making it well suited to edge computing. The architecture includes optimized memristive convolutional modules, batch normalization modules, activation function modules, global average pooling modules, and fully connected modules. Experimental results on the CIFAR-10 dataset show that our memristor-based MobileNetV3 achieves over 90% accuracy while significantly reducing inference time and energy consumption compared to conventional digital implementations. This work demonstrates the potential of memristor-based designs for efficient deployment of deep learning models in resource-constrained environments.
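The analog-domain mapping mentioned above — storing network weights as memristor conductances so that vector-matrix multiplication happens via Ohm's and Kirchhoff's laws — can be sketched in a few lines. This is a minimal illustrative model, not the paper's actual circuit design: the conductance range, the differential-pair scheme, and the function names are assumptions chosen for clarity.

```python
import numpy as np

# Hypothetical conductance range of a memristor cell, in siemens.
G_MIN, G_MAX = 1e-6, 1e-4

def weights_to_conductances(W):
    """Map a signed weight matrix onto a differential pair of crossbars.

    Each weight w is represented as (G_pos - G_neg), since a single
    memristor can only realize a non-negative conductance.
    """
    w_max = np.abs(W).max()
    scale = (G_MAX - G_MIN) / w_max              # siemens per unit weight
    G_pos = G_MIN + scale * np.clip(W, 0, None)  # positive weight part
    G_neg = G_MIN + scale * np.clip(-W, 0, None) # negative weight part
    return G_pos, G_neg, scale

def crossbar_vmm(v_in, G_pos, G_neg, scale):
    """Analog vector-matrix multiply: input voltages drive the rows,
    output currents sum along the columns (Kirchhoff's current law).
    The differential subtraction cancels the G_MIN offset exactly."""
    i_out = v_in @ G_pos - v_in @ G_neg
    return i_out / scale                         # back to weight units

# The ideal crossbar reproduces the digital matrix product.
W = np.array([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]])
x = np.array([1.0, 0.5, -1.0])
print(np.allclose(crossbar_vmm(x, *weights_to_conductances(W)), x @ W))  # True
```

In an ideal device model the analog result matches `x @ W` exactly; the paper's training-inference co-optimization is what keeps accuracy high once real-device non-idealities (limited conductance levels, noise, IR drop) are taken into account.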