End-to-End Keyword Spotting on FPGA Using Graph Neural Networks with a Neuromorphic Auditory Sensor

📅 2026-05-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high power consumption and redundant preprocessing inherent in keyword spotting on edge devices by proposing the first end-to-end FPGA system that directly processes event streams from a neuromorphic auditory sensor. By integrating neuromorphic sensing, a graph neural network, a compute-near-memory architecture, and quantized inference on a single FPGA device, the system performs real-time keyword recognition without conventional audio preprocessing. Evaluated on the Google Speech Commands v2 dataset processed through the neuromorphic sensor, the quantized system achieves an accuracy of 87.43% with an end-to-end latency below 35 µs and an average power consumption of 1.12 W.

📝 Abstract

With the rapid growth of mobile robotics and embedded intelligence, there is an increasing demand for efficient on-device data processing on edge platforms. A promising research direction is the use of neuromorphic sensors inspired by human sensory systems, which generate sparse, event-based data encoding changes in the environment. In this work, we present the first end-to-end FPGA implementation of a keyword spotting system that integrates a Neuromorphic Auditory Sensor (NAS) and a graph neural network (GNN) on a single FPGA device, enabling real-time processing of raw audio data. The proposed architecture eliminates conventional signal preprocessing and operates directly on event-based audio streams. Leveraging a compute-near-memory network architecture, the system achieves efficient inference with low latency and low power consumption. Experimental results demonstrate an accuracy of 87.43% after quantization on the Google Speech Commands v2 dataset processed through the neuromorphic sensor, with end-to-end latency below 35 µs and average power consumption of 1.12 W. The processed datasets, software models, and hardware modules are available at https://github.com/vision-agh/NAS-GNN-KWS.
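
To make the idea of a GNN operating directly on event-based audio more concrete, the following is a minimal sketch of how a sparse event stream could be turned into a graph. It is not taken from the paper or its repository: the event layout `(timestamp_us, channel)`, the function name `build_event_graph`, and the neighbourhood parameters `max_dt_us` and `max_dch` are all hypothetical choices made for illustration. Each event becomes a node, and a directed edge links an earlier event to a later one when the two are close in both time and frequency channel.

```python
from typing import List, Tuple

# Hypothetical event layout: (timestamp in microseconds, frequency channel index).
Event = Tuple[int, int]


def build_event_graph(
    events: List[Event],
    max_dt_us: int = 1000,  # assumed temporal neighbourhood radius
    max_dch: int = 2,       # assumed channel neighbourhood radius
) -> List[Tuple[int, int]]:
    """Return directed edges (older_index, newer_index) between nearby events.

    Events are assumed to be sorted by timestamp, as they would be when
    arriving from a sensor in streaming order.
    """
    edges = []
    for j, (t_j, ch_j) in enumerate(events):
        # Walk backwards through earlier events until they are too old.
        for i in range(j - 1, -1, -1):
            t_i, ch_i = events[i]
            if t_j - t_i > max_dt_us:
                break  # all remaining events are even older
            if abs(ch_j - ch_i) <= max_dch:
                edges.append((i, j))
    return edges


# Illustrative events only; not real sensor data.
events = [(0, 3), (200, 4), (500, 10), (900, 5), (2500, 4)]
edges = build_event_graph(events)
print(edges)
```

In this sketch, edges only point from older to newer events, so the graph can be extended one event at a time without revisiting earlier nodes. Whether the published system uses this particular neighbourhood rule is not stated in the abstract.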

Problem

Research questions and friction points this paper is trying to address.

Keyword Spotting

Neuromorphic Auditory Sensor

Edge Computing

Event-based Audio

On-device Processing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neuromorphic Auditory Sensor

Graph Neural Network

FPGA