🤖 AI Summary
Existing spiking neural networks (SNNs) for low-power AI in intelligent mobile agents (UGVs/UAVs) neglect in-memory computing (IMC) hardware constraints, leading to suboptimal trade-offs between accuracy and energy efficiency. Method: We propose the first hardware-aware neural architecture search (NAS) framework for SNNs, uniquely integrating quantization modeling, learnability analysis, and system-level IMC constraints (area, latency, energy) into the NAS pipeline. Our approach tightly couples event-driven SNN modeling, 8-bit weight quantization, RRAM compatibility, gradient approximation, and sparse training. Results: Evaluated on CIFAR-10, CIFAR-100, and TinyImageNet-200, the discovered models achieve high accuracy while satisfying all hardware constraints: search speed improves by 6.6×, chip area shrinks by 92%, energy consumption drops by 84%, and latency improves by 1.2×.
📝 Abstract
Intelligent mobile agents (e.g., UGVs and UAVs) typically demand low power/energy consumption when solving their machine learning (ML)-based tasks, since they are usually powered by portable batteries with limited capacity. A potential solution is employing neuromorphic computing with Spiking Neural Networks (SNNs), which leverage event-based computation to enable ultra-low-power/energy ML algorithms. To maximize the performance efficiency of SNN inference, In-Memory Computing (IMC)-based hardware accelerators with emerging device technologies (e.g., RRAM) can be employed. However, SNN models are typically developed without considering constraints from the application and the underlying IMC hardware, thereby hindering SNNs from reaching their full potential in performance and efficiency. To address this, we propose NeuroNAS, a novel framework for developing energy-efficient neuromorphic IMC for intelligent mobile agents using hardware-aware spiking neural architecture search (NAS), i.e., by quickly finding an SNN architecture that offers high accuracy under the given constraints (e.g., memory, area, latency, and energy consumption). Its key steps include: optimizing SNN operations to enable efficient NAS, employing quantization to minimize the memory footprint, developing an SNN architecture that facilitates effective learning, and devising a systematic hardware-aware search algorithm to meet the constraints. Compared to state-of-the-art techniques, NeuroNAS quickly finds SNN architectures (with 8-bit weight precision) that maintain high accuracy, achieving up to 6.6x search-time speed-ups, 92% area savings, 1.2x latency improvements, and 84% energy savings across different datasets (i.e., CIFAR-10, CIFAR-100, and TinyImageNet-200), whereas the state-of-the-art techniques fail to meet all constraints at once.
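To make the constraint-driven search concrete, the sketch below shows a minimal hardware-aware architecture search loop in the spirit described above: enumerate candidate SNN configurations, estimate their hardware costs with a cost model, reject any candidate that violates a memory, area, latency, or energy constraint, and keep the best-scoring survivor. All class/function names, the cost formulas, and the accuracy proxy are illustrative assumptions for this sketch, not the paper's actual NeuroNAS implementation.

```python
# Hypothetical hardware-aware NAS loop (illustrative only; cost model and
# accuracy proxy are made-up stand-ins for the paper's IMC models).
from dataclasses import dataclass
from itertools import product

@dataclass
class Candidate:
    depth: int         # number of spiking conv blocks
    channels: int      # channels per block
    weight_bits: int   # quantization precision (the paper targets 8-bit)

def hw_cost(c: Candidate):
    """Toy IMC cost model: memory (KB) plus area/latency/energy proxies."""
    params = c.depth * c.channels * c.channels * 9   # rough 3x3-kernel count
    memory_kb = params * c.weight_bits / 8 / 1024
    area = params * 0.001       # arbitrary units
    latency = c.depth * 0.5
    energy = params * 0.0005
    return memory_kb, area, latency, energy

def accuracy_proxy(c: Candidate):
    """Stand-in for learnability analysis / trained accuracy."""
    return c.depth * 2 + c.channels * 0.05

def search(constraints):
    best = None
    for depth, channels in product(range(2, 9), (16, 32, 64, 128)):
        c = Candidate(depth, channels, weight_bits=8)
        mem, area, lat, en = hw_cost(c)
        # Reject candidates that violate any hardware constraint.
        if (mem > constraints["memory_kb"] or area > constraints["area"]
                or lat > constraints["latency"] or en > constraints["energy"]):
            continue
        score = accuracy_proxy(c)
        if best is None or score > best[0]:
            best = (score, c)
    return best[1] if best else None

model = search({"memory_kb": 512, "area": 600, "latency": 4.0, "energy": 300})
print(model)
```

The key design point mirrored here is that hardware feasibility acts as a hard filter inside the search loop, so every returned architecture satisfies all constraints at once rather than optimizing accuracy first and checking costs afterwards.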