🤖 AI Summary
To address the trade-off between depth and efficiency of CNNs in resource-constrained embedded systems, this paper proposes Double-Win NAS, a novel paradigm that jointly combines deep neural architecture search with shallow structural equivalence transformation. Methodologically, it integrates deep-to-shallow transformable neural architecture search, hybrid transformable training, arbitrary-resolution elastic training, and end-to-end embedded deployment optimization. Its key innovations are: (i) a deep-to-shallow equivalent structural transformation mechanism that preserves inference accuracy while reducing effective network depth; and (ii) enhanced training techniques that improve training accuracy and enable natural network elasticity across arbitrary input resolutions. Evaluated on NVIDIA Jetson AGX Xavier and Jetson Nano platforms using ImageNet and ImageNet-100, Double-Win NAS achieves 2.3× higher inference throughput and 3.1× better energy efficiency than state-of-the-art NAS methods without any accuracy loss, thus effectively balancing model accuracy and hardware deployment efficiency.
📝 Abstract
Thanks to ever-growing network depth, convolutional neural networks (CNNs) have achieved remarkable success across various embedded scenarios, paving the way for ubiquitous embedded intelligence. Despite this promise, growing network depth comes at the cost of degraded hardware efficiency. In contrast to deep networks, shallow networks deliver superior hardware efficiency but often suffer from inferior accuracy. To resolve this dilemma, we propose Double-Win NAS, a novel deep-to-shallow transformable neural architecture search (NAS) paradigm tailored for resource-constrained intelligent embedded systems. Specifically, Double-Win NAS strives to automatically explore deep networks to first win strong accuracy, which are then equivalently transformed into their shallow counterparts to further win strong hardware efficiency. Beyond the search itself, we propose two enhanced training techniques: hybrid transformable training for better training accuracy, and arbitrary-resolution elastic training for natural network elasticity across arbitrary input resolutions. Extensive experiments on two popular intelligent embedded systems (NVIDIA Jetson AGX Xavier and NVIDIA Jetson Nano) and two representative large-scale datasets (ImageNet and ImageNet-100) clearly demonstrate the superiority of Double-Win NAS over previous state-of-the-art NAS approaches.
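To make the idea of an accuracy-preserving deep-to-shallow transformation concrete, here is a minimal, hypothetical sketch (not the paper's actual algorithm): two stacked linear layers with no nonlinearity between them can be collapsed into a single equivalent layer, since y = W2(W1·x + b1) + b2 = (W2·W1)·x + (W2·b1 + b2). Structural re-parameterizations of this flavor reduce layer count while leaving the computed function, and hence accuracy, unchanged.

```python
import numpy as np

# Illustrative only: names and shapes are assumptions, not from the paper.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)                        # input vector
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((4, 16)), rng.standard_normal(4)

# "Deep" forward pass: two stacked linear layers (no activation between them).
deep = W2 @ (W1 @ x + b1) + b2

# Equivalent "shallow" layer: fold both weight matrices and biases into one.
W = W2 @ W1                                       # merged weight, shape (4, 8)
b = W2 @ b1 + b2                                  # merged bias, shape (4,)
shallow = W @ x + b

assert np.allclose(deep, shallow)                 # identical outputs, one layer fewer
```

In a real CNN the same algebra applies to consecutive convolutions without intervening nonlinearities, which is why such a transformation can trade depth for hardware efficiency at zero accuracy cost.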