AI Summary
Edge AI faces two critical challenges: the high storage overhead of neural network weights and weak security guarantees. To address these, this paper proposes WINGs, a novel framework that dynamically generates fully connected (FC) layer weights by combining principal component analysis (PCA) for dimensionality reduction with lightweight support vector regression (SVR), thereby eliminating explicit storage of FC weight matrices. Additionally, guided by sensitivity analysis, WINGs applies weight compression only to low-sensitivity layers in CNNs. This design jointly optimizes storage efficiency and robustness: it achieves 53× compression for FC layers on MNIST and 18× for AlexNet on CIFAR-10, significantly reducing memory footprint, improving inference throughput, and lowering energy consumption, while incurring only 1–2% accuracy degradation. Moreover, WINGs strengthens detection of adversarial weight-tampering (bit-flip) attacks. Overall, WINGs establishes a new paradigm for efficient and secure neural inference under stringent resource constraints.
Abstract
Complex neural networks require substantial memory to store a large number of synaptic weights. This work introduces WINGs (Automatic Weight Generator for Secure and Storage-Efficient Deep Learning Models), a novel framework that dynamically generates layer weights in fully connected (FC) neural networks and compresses the weights in convolutional neural networks (CNNs) during inference, significantly reducing memory requirements without sacrificing accuracy. The WINGs framework uses principal component analysis (PCA) for dimensionality reduction and lightweight support vector regression (SVR) models to predict layer weights in FC networks, removing the need to store full weight matrices and achieving substantial memory savings. It also preferentially compresses the weights in low-sensitivity layers of CNNs using PCA and SVR guided by sensitivity analysis. The sensitivity-aware design offers an added level of security, as any bit-flip attack on weights in compressed layers has an amplified and readily detectable effect on accuracy. WINGs achieves 53× compression for the FC layers and 28× for AlexNet on the MNIST dataset, and 18× for AlexNet on the CIFAR-10 dataset, with 1–2% accuracy loss. This significant reduction in memory results in higher throughput and lower energy for DNN inference, making it attractive for resource-constrained edge applications.
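The amplification effect behind the security claim can be sketched with plain NumPy: when weights are regenerated from a compressed representation, a single-bit fault in one stored coefficient perturbs an entire row of the reconstructed weight matrix rather than a single weight, which is what makes the tampering easier to detect. This is an illustrative toy model under assumed shapes, not the paper's detection mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical compressed storage: k coefficients per output row
# plus a shared basis, from which weights are regenerated.
k, n_out, n_in = 8, 64, 32
basis = rng.normal(size=(k, n_in))
coeffs = rng.normal(size=(n_out, k))
W = coeffs @ basis                      # weights regenerated at inference

def flip_sign_bit(a, idx):
    """Simulate a single-bit hardware fault: flip the sign bit of one float64."""
    b = np.ascontiguousarray(a, dtype=np.float64).copy()
    raw = b.view(np.uint64)             # reinterpret the IEEE-754 bit pattern
    raw[idx] ^= np.uint64(1 << 63)      # toggle the sign bit
    return b

# Fault injected into the *compressed* representation (coefficient 5,
# i.e. row 0, component 5 of the coefficient table).
coeffs_faulty = flip_sign_bit(coeffs.ravel(), 5).reshape(coeffs.shape)
W_faulty = coeffs_faulty @ basis

changed = int(np.count_nonzero(~np.isclose(W, W_faulty)))
print(f"weights affected by one flipped coefficient bit: {changed} / {W.size}")
```

Flipping one bit of an explicitly stored weight would corrupt exactly one value; here the same single-bit fault spreads across all `n_in` regenerated weights of the affected row, so its effect on accuracy is amplified and easier to flag.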