🤖 AI Summary
To address resource constraints, high energy consumption, and privacy sensitivity in skin lesion classification on edge devices, this paper proposes QANA, a quantization-aware neuromorphic architecture that converts to a spiking neural network. QANA combines Ghost modules for parameter efficiency, Squeeze-and-Excitation channel-wise attention for discriminative feature enhancement, and a spike-compatible quantization-aware training mechanism that enables end-to-end optimization and efficient deployment on neuromorphic hardware (e.g., BrainChip Akida). QANA achieves 91.6% Top-1 accuracy and 82.4% macro-F1 on HAM10000, and 90.8% Top-1 accuracy and 81.7% macro-F1 on a clinical dataset. On-device deployment yields 1.5 ms inference latency and 1.7 mJ per inference, reducing energy consumption by over 98.6% compared to GPU-based inference. To our knowledge, QANA is the first framework to jointly optimize accuracy, energy efficiency, and hardware deployability for edge dermatology diagnosis, establishing a new paradigm for lightweight, privacy-preserving skin lesion analysis.
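For readers unfamiliar with the Squeeze-and-Excitation attention named above, the following is a minimal pure-Python sketch of the general technique. It is not the authors' implementation: the tensor sizes, the reduction ratio `r`, and the random weights are illustrative assumptions, and a real model would learn the weights and use a tensor library.

```python
import math
import random

def squeeze_excite(x, w1, w2):
    """Apply SE channel attention to x, a list of C channels (each H x W nested lists).

    w1 has shape (C // r) x C and w2 has shape C x (C // r); r is a
    hypothetical reduction ratio, not a value taken from the paper.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck FC + ReLU, then FC + sigmoid gives gates in (0, 1).
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden)))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    y = [[[gates[c] * v for v in row] for row in ch] for c, ch in enumerate(x)]
    return y, gates

random.seed(0)
C, H, W, r = 4, 2, 2, 2  # illustrative sizes only
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
y, gates = squeeze_excite(x, w1, w2)
print([round(g, 3) for g in gates])  # one attention weight per channel
```

The output keeps the input shape; only the relative weighting of channels changes, which is why the block adds few parameters.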
📝 Abstract
Accurate and efficient skin lesion classification on edge devices is critical for accessible dermatological care but remains challenging due to computational, energy, and privacy constraints. We introduce QANA, a novel quantization-aware neuromorphic architecture for incremental skin lesion classification on resource-limited hardware. QANA integrates ghost modules, efficient channel attention, and squeeze-and-excitation blocks for robust feature representation with low-latency, energy-efficient inference. Its quantization-aware head and spike-compatible transformations enable seamless conversion to spiking neural networks (SNNs) and deployment on neuromorphic platforms. Evaluation on the large-scale HAM10000 benchmark and a real-world clinical dataset shows that QANA achieves 91.6% Top-1 accuracy and 82.4% macro F1 on HAM10000, and 90.8% Top-1 accuracy and 81.7% macro F1 on the clinical dataset, significantly outperforming state-of-the-art CNN-to-SNN models under fair comparison. Deployed on BrainChip Akida hardware, QANA achieves 1.5 ms inference latency and 1.7 mJ energy per image, reducing inference latency and energy use by over 94.6% and 98.6%, respectively, compared to GPU-based CNNs, and surpassing state-of-the-art CNN-to-SNN conversion baselines. These results demonstrate the effectiveness of QANA for accurate, real-time, and privacy-sensitive medical analysis in edge environments.
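As a rough sanity check on the reported savings, the sketch below back-computes the GPU baseline implied by the Akida figures. It assumes the 94.6% figure refers to latency and the 98.6% figure to energy (the order in which they are listed), and that each reduction is measured relative to the GPU value. The function name `implied_baseline` is hypothetical. Because the abstract says "over", the results are lower bounds on the baseline, not reported measurements.

```python
def implied_baseline(value, reduction_pct):
    """Return baseline b such that value == b * (1 - reduction_pct / 100)."""
    return value / (1.0 - reduction_pct / 100.0)

# Akida figures from the abstract: 1.5 ms and 1.7 mJ per image.
gpu_latency_ms = implied_baseline(1.5, 94.6)  # assumed pairing: 94.6% <-> latency
gpu_energy_mj = implied_baseline(1.7, 98.6)   # assumed pairing: 98.6% <-> energy

print(f"{gpu_latency_ms:.1f} ms, {gpu_energy_mj:.1f} mJ")  # prints "27.8 ms, 121.4 mJ"
```

Under these assumptions the GPU baseline would take at least about 27.8 ms and 121.4 mJ per image, roughly 18 times slower and 71 times more energy-hungry than the neuromorphic deployment.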