AI Summary
Existing GNN hardware accelerators struggle to simultaneously address the energy-efficiency and latency challenges posed by the sparsity of event streams and the irregularity of graph structures in edge vision tasks with event cameras. This work proposes the first dedicated hardware accelerator for event-driven graph neural networks (GNNs). It introduces a dynamic directed graph representation to model asynchronous spatiotemporal dependencies among events; employs an event-queue-driven, spatiotemporally decoupled neighborhood search to reduce neighbor discovery overhead; and designs a multi-layer pipelined parallel GNN architecture enabling co-optimization of computation and memory access. Implemented on a Xilinx KV260 MPSoC, the accelerator achieves 87.8% classification accuracy on the N-CARS dataset with an average per-event processing latency of only 16 μs, marking the first demonstration of microsecond-level, real-time, high-accuracy edge event-based visual inference.
Abstract
Edge vision systems combining sensing and embedded processing promise low-latency, decentralized, and energy-efficient solutions that forgo reliance on the cloud. Unlike conventional frame-based vision sensors, event-based cameras deliver microsecond-scale temporal resolution with sparse information encoding, thereby opening new opportunities for edge vision systems. However, mainstream algorithms for frame-based vision, which mostly rely on convolutional neural networks (CNNs), can hardly exploit the advantages of event-based vision as they are typically optimized for dense matrix-vector multiplications. While event-driven graph neural networks (GNNs) have recently emerged as a promising solution for sparse event-based vision, their irregular structure is a challenge that currently hinders the design of efficient hardware accelerators. In this paper, we propose EvGNN, the first event-driven GNN accelerator for low-footprint, ultra-low-latency, and high-accuracy edge vision with event-based cameras. It relies on three central ideas: (i) directed dynamic graphs exploiting single-hop nodes with edge-free storage, (ii) event queues for the efficient identification of local neighbors within a spatiotemporally decoupled search range, and (iii) a novel layer-parallel processing scheme allowing for low-latency execution of multi-layer GNNs. We deployed EvGNN on a Xilinx KV260 UltraScale+ MPSoC platform and benchmarked it on the N-CARS dataset for car recognition, demonstrating a classification accuracy of 87.8% and an average latency per event of 16 μs, thereby enabling real-time, microsecond-resolution event-based vision at the edge.
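To make ideas (i) and (ii) concrete, the event-queue-driven neighbor search can be illustrated with a minimal software sketch. This is not the paper's hardware implementation: the class and parameter names (`EventQueueGraph`, `radius`, `window`, `depth`) are illustrative assumptions. The key points it mirrors are that one bounded FIFO queue per pixel holds only recent events, a new event scans only queues inside a spatial radius, and within each queue only events inside a temporal window become neighbors, so the spatial and temporal search ranges are decoupled. Because edges always point from past events to the newest event, the graph is directed and can be built incrementally without storing an explicit edge list.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Event:
    x: int
    y: int
    t: float  # timestamp in microseconds

class EventQueueGraph:
    """Illustrative sketch of an event-queue-driven neighbor search.

    One bounded FIFO queue per pixel stores recent events. For a new
    event, only queues within a spatial radius are scanned, and within
    each queue only events inside the temporal window count as
    neighbors: the spatial and temporal ranges are decoupled.
    """

    def __init__(self, width, height, radius=3, window=50_000.0, depth=4):
        self.radius = radius            # spatial search range (pixels)
        self.window = window            # temporal search range (μs)
        self.queues = [[deque(maxlen=depth) for _ in range(width)]
                       for _ in range(height)]

    def insert(self, ev):
        """Return past neighbors (directed edges past -> ev), then enqueue ev."""
        r, t_min = self.radius, ev.t - self.window
        neighbors = []
        for y in range(max(0, ev.y - r), min(len(self.queues), ev.y + r + 1)):
            for x in range(max(0, ev.x - r), min(len(self.queues[0]), ev.x + r + 1)):
                for past in self.queues[y][x]:
                    if past.t >= t_min:       # inside the temporal window
                        neighbors.append(past)
        self.queues[ev.y][ev.x].append(ev)    # bounded queue evicts the oldest
        return neighbors
```

A bounded queue depth keeps per-pixel storage constant, which is what makes the structure hardware-friendly: stale events are evicted implicitly rather than searched and deleted.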