AI Summary
This work addresses the challenge of efficiently exploring the vast design space of approximate deep neural networks (DNNs), where exhaustive search becomes intractable due to poor scalability and suboptimal trade-offs among accuracy, power, and area. To overcome this, the authors propose HAWX, a framework that introduces the first multi-granularity sensitivity analysis spanning operators, filters, layers, and entire models. HAWX integrates hardware-aware search algorithms with predictive models for accuracy, power, and area to guide the selective integration of heterogeneous approximate computing (AxC) modules. It supports spatial-temporal accelerator architectures and accommodates both off-the-shelf and custom approximate components, achieving exponentially increasing search efficiency with model scale. Experiments demonstrate 23× and 3 million× speedups for layer- and filter-level searches on LeNet-5, respectively, matching exhaustive search accuracy, while also validating scalability on VGG-11, ResNet-18, and EfficientNetLite.
Abstract
This work presents HAWX, a hardware-aware scalable exploration framework that employs multi-level sensitivity scoring at different DNN abstraction levels (operator, filter, layer, and model) to guide selective integration of heterogeneous AxC blocks. Supported by predictive models for accuracy, power, and area, HAWX accelerates the evaluation of candidate configurations, achieving over 23× speedup in a layer-level search with two candidate approximate blocks and more than (3×10^6)× speedup at the filter-level search only for LeNet-5, while maintaining accuracy comparable to exhaustive search. Experiments across state-of-the-art DNN benchmarks such as VGG-11, ResNet-18, and EfficientNetLite demonstrate that the efficiency benefits of HAWX scale exponentially with network size. The HAWX hardware-aware search algorithm supports both spatial and temporal accelerator architectures, leveraging either off-the-shelf approximate components or customized designs.
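To illustrate why the gap between exhaustive and sensitivity-guided search widens with network size, the following minimal Python sketch contrasts the number of configurations an exhaustive search must enumerate with a simple greedy pass over sensitivity-ranked units. It is not the HAWX algorithm: the function names, the sensitivity values, and the assumption that per-unit accuracy drops add up linearly are all hypothetical and chosen only for illustration.

```python
def exhaustive_count(num_units: int, num_choices: int) -> int:
    """Number of configurations when each unit (layer or filter)
    independently picks one of `num_choices` implementations."""
    return num_choices ** num_units


def greedy_sensitivity_search(sensitivity: list[float], budget: float):
    """Toy sensitivity-guided selection (hypothetical, not HAWX itself).

    `sensitivity[i]` is an assumed estimate of the accuracy drop caused by
    approximating unit i. Units are visited from least to most sensitive and
    approximated while the summed estimated drop stays within `budget`.
    Assumes drops are additive, which real networks need not satisfy.
    """
    order = sorted(range(len(sensitivity)), key=lambda i: sensitivity[i])
    chosen, total_drop = [], 0.0
    for i in order:
        if total_drop + sensitivity[i] <= budget:
            chosen.append(i)
            total_drop += sensitivity[i]
    return sorted(chosen), total_drop


# Illustrative numbers only: 5 units with 3 implementation choices each.
print(exhaustive_count(5, 3))    # configurations to enumerate exhaustively
print(exhaustive_count(50, 3))   # same choices, ten times as many units

# Made-up sensitivity scores and accuracy-drop budget.
units, drop = greedy_sensitivity_search([0.5, 0.1, 0.3, 0.05, 0.9], budget=0.5)
print(units, drop)
```

The exhaustive count grows exponentially in the number of units, whereas the greedy pass touches each unit once after sorting. This is the kind of asymmetry that makes a ranking-based search increasingly attractive at filter granularity, where the number of units is far larger than at layer granularity.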