🤖 AI Summary
Conventional CNNs suffer from high computational cost and energy consumption, which hinders their deployment on resource-constrained edge devices. Method: This work investigates the effect of embedding signal-processing transforms, specifically the Fast Fourier Transform (FFT), Walsh–Hadamard Transform (WHT), and Discrete Cosine Transform (DCT), into ResNet50, with a focus on computational efficiency, energy consumption, and classification accuracy. It proposes a multi-layer embedding scheme for WHT within CNNs, applying the transform in both early and late layers, and presents this as the first systematic exploration of the resulting cross-layer gains. Contribution/Results: Evaluated on CIFAR-100 under a unified energy–accuracy assessment framework, the WHT-enhanced ResNet50 improves test accuracy from 66.0% to 79.2% while reducing average per-model energy consumption from 25.6 MJ to 39 kJ (a reduction of more than 99.8%). The approach remains architecturally compatible with standard ResNet50, combining higher accuracy with much lower energy use and offering a scalable path toward efficient edge vision models.
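As a quick sanity check on the energy claim, the short Python sketch below recomputes the relative reduction from the two averages reported in the abstract (25,606 kJ, i.e. about 25.6 MJ, for the baseline and 39 kJ for the WHT variant). The variable names are illustrative only and do not come from the paper.

```python
# Average energy per model, as reported in the abstract (kJ).
baseline_kj = 25_606.0  # baseline ResNet50
wht_kj = 39.0           # ResNet50 with WHT layers

# Relative reduction and the baseline-to-WHT energy ratio.
reduction = 1.0 - wht_kj / baseline_kj
ratio = baseline_kj / wht_kj

print(f"energy reduction: {reduction:.2%}")
print(f"baseline uses roughly {ratio:.0f}x the energy of the WHT model")
```

The result is consistent with the "more than 99.8%" figure quoted in the summary.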
📝 Abstract
This study investigates the integration of signal processing transformations, namely the Fast Fourier Transform (FFT), Walsh-Hadamard Transform (WHT), and Discrete Cosine Transform (DCT), within the ResNet50 convolutional neural network (CNN) model for image classification. The primary objective is to assess the trade-offs between computational efficiency, energy consumption, and classification accuracy during training and inference. Using the CIFAR-100 dataset (100 classes, 60,000 images), experiments demonstrated that incorporating WHT significantly reduced energy consumption while improving accuracy. Specifically, a baseline ResNet50 model achieved a testing accuracy of 66%, consuming an average of 25,606 kJ per model. In contrast, a modified ResNet50 incorporating WHT in the early convolutional layers achieved 74% accuracy, and an enhanced version with WHT applied to both early and late layers achieved 79% accuracy, with an average energy consumption of only 39 kJ per model. These results demonstrate the potential of WHT as a highly efficient and effective approach for energy-constrained CNN applications.
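To make the transform itself concrete, here is a minimal NumPy sketch of the fast Walsh-Hadamard transform, which uses only additions and subtractions. The abstract does not describe how the WHT is wired into ResNet50, so this is not the authors' layer: the function names `fwht` and `fwht2`, the unnormalized natural (Hadamard) ordering, and the separable 2-D variant are all assumptions made for illustration.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along the last axis.

    Natural (Hadamard) ordering; the last-axis length must be a power of two.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    x = np.asarray(x, dtype=np.float64).copy()
    n = x.shape[-1]
    if n == 0 or n & (n - 1):
        raise ValueError("last-axis length must be a power of two")
    lead = x.shape[:-1]
    h = 1
    while h < n:
        # Split the last axis into blocks of two halves of length h each.
        y = x.reshape(*lead, n // (2 * h), 2, h)
        top = y[..., 0, :] + y[..., 1, :]  # butterfly: sums
        bot = y[..., 0, :] - y[..., 1, :]  # butterfly: differences
        x = np.stack((top, bot), axis=-2).reshape(*lead, n)
        h *= 2
    return x

def fwht2(img):
    """Separable 2-D WHT over the last two axes (assumed variant)."""
    rows = fwht(img)                                 # transform along width
    return fwht(rows.swapaxes(-1, -2)).swapaxes(-1, -2)  # then along height

signal = np.array([1, 0, 1, 0, 0, 1, 1, 0], dtype=float)
print(fwht(signal))  # expected values: 4, 2, 0, -2, 0, 2, 0, 2
```

Because the butterflies need no multiplications, a length-n transform costs about n log2(n) additions and subtractions, which is one plausible reason a WHT-based layer could be cheaper than a dense convolution. Applying `fwht` twice returns the input scaled by n, so the inverse is the same routine divided by n.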