🤖 AI Summary
This work addresses the inefficiency of existing 3D convolutional neural networks on edge devices, which stems from their inability to adjust computation dynamically based on input content. To overcome this limitation, the authors propose DANCE, a novel framework that, for the first time, introduces input-aware dynamic pruning across three dimensions of 3D CNNs: frames, channels, and features. DANCE increases the diversity of neuronal responses through Activation Variability Amplification (AVA), then introduces fine-grained sparsity within convolutional layers through Adaptive Activation Pruning (AAP), driven by a lightweight controller. Experiments demonstrate that DANCE achieves speedups of 1.37× and 2.22× on the Jetson Nano and Snapdragon 8 Gen 1 platforms, respectively, while improving energy efficiency by up to 1.47× and significantly reducing both MAC operations and memory-access overhead.
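The summary does not spell out how AVA is trained; one plausible reading is a regularizer added to the task loss that rewards high variance in activation magnitudes, so retraining spreads neuron responses apart and makes them easier to rank for pruning. The sketch below (all names, shapes, and the weighting `lam` are hypothetical, and the paper's exact formulation may differ) illustrates that idea in numpy:

```python
import numpy as np

def ava_penalty(activations, lam=0.1):
    """Hypothetical sketch of the Activation Variability Amplification idea:
    a term ADDED to the task loss that is the negative of the average
    per-neuron variance of activation magnitudes, so minimizing the total
    loss pushes that variance up.

    activations: (N, C) array of one layer's activations over a batch.
    """
    mag = np.abs(activations)
    var = mag.var(axis=0).mean()  # average per-neuron variance across the batch
    return -lam * var             # minimizing this term increases variance

# Toy batch of activations for a two-neuron layer.
batch = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
penalty = ava_penalty(batch)  # each neuron has variance 1.0, so -0.1 * 1.0
```

The sign convention is the design point: because the penalty is the *negative* variance, gradient descent on `task_loss + ava_penalty` trades a little task accuracy for more widely separated activation magnitudes.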
📝 Abstract
Modern convolutional neural networks (CNNs) are workhorses for video and image processing, but they cannot dynamically adapt their computation to the complexity of individual input samples to minimize energy consumption. In this research, we propose DANCE, a fine-grained, input-aware, dynamic pruning framework for 3D CNNs that maximizes power efficiency with negligible to zero impact on performance. Our approach has two steps. In the first step, activation variability amplification (AVA), the 3D CNN is retrained to increase the variance of neuron activation magnitudes across the network, facilitating pruning decisions across diverse input scenarios. In the second step, adaptive activation pruning (AAP), a lightweight activation controller network is trained to dynamically prune frames, channels, and features of the network's 3D convolutional layers (with per-layer decisions), based on statistics of the outputs of the network's first layer. Our method achieves substantial savings in multiply-accumulate (MAC) operations and memory accesses by introducing sparsity within convolutional layers. Hardware validation on the NVIDIA Jetson Nano GPU and the Qualcomm Snapdragon 8 Gen 1 platform demonstrates respective speedups of 1.37× and 2.22×, achieving up to 1.47× higher energy efficiency compared to the state of the art.
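The abstract describes AAP as a controller that maps first-layer statistics to per-layer frame/channel/feature pruning decisions, but gives no concrete form. As a rough illustration only (the function name, the saliency measure, and the keep ratios are all assumptions, not the paper's actual controller), a minimal numpy sketch of deriving frame and channel keep-masks from first-layer activation statistics might look like:

```python
import numpy as np

def aap_masks(first_layer_out, keep_frames=0.75, keep_channels=0.75):
    """Hypothetical sketch of the Adaptive Activation Pruning idea:
    score frames and channels by mean activation magnitude in the first
    layer's output, then keep only the most salient fraction of each.

    first_layer_out: (C, T, H, W) activations of the first 3D conv layer.
    Returns boolean keep-masks of shapes (T,) and (C,).
    """
    C, T, H, W = first_layer_out.shape
    mag = np.abs(first_layer_out)
    frame_score = mag.mean(axis=(0, 2, 3))  # one saliency score per frame, (T,)
    chan_score = mag.mean(axis=(1, 2, 3))   # one saliency score per channel, (C,)
    k_t = max(1, int(round(keep_frames * T)))
    k_c = max(1, int(round(keep_channels * C)))
    frame_mask = np.zeros(T, dtype=bool)
    frame_mask[np.argsort(frame_score)[-k_t:]] = True  # keep top-k_t frames
    chan_mask = np.zeros(C, dtype=bool)
    chan_mask[np.argsort(chan_score)[-k_c:]] = True    # keep top-k_c channels
    return frame_mask, chan_mask

# Example: random "activations" for a layer with 8 channels and 16 frames.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 14, 14))
frame_mask, chan_mask = aap_masks(x)  # keeps 12 of 16 frames, 6 of 8 channels
```

Skipping the convolutions for masked-out frames and channels is what converts these decisions into the MAC and memory-access savings the abstract reports; in the paper this mapping is learned by a trained controller network rather than a fixed top-k rule.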