🤖 AI Summary
Event cameras produce asynchronous, sparse, high-temporal-resolution data, and conventional asynchronous-to-synchronous (A2S) representation methods handle it with fundamental limitations: weak expressivity, poor generalization, and constrained real-time performance. To address these challenges, we propose EVA, an end-to-end asynchronous representation learning framework that, for the first time, brings linear attention mechanisms and self-supervised language modeling into event-based learning. EVA employs a streaming encoder operating at the event level, enabling direct “event → vector” mapping without explicit synchronization preprocessing. This design simultaneously achieves high representational capacity, strong generalization, and low inference latency. On the DVS128-Gesture and N-Cars classification benchmarks, EVA outperforms existing A2S approaches. Moreover, on the Gen1 object detection task, it achieves 47.7 mAP, marking the first substantive breakthrough of the A2S paradigm in a challenging, high-difficulty detection scenario.
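The summary's details of EVA's encoder are not spelled out here, but the core idea of an event-level streaming encoder with linear attention can be illustrated with a minimal sketch. The code below is a hypothetical toy, not EVA's actual architecture: the embedding matrices, feature map, and dimensions are all illustrative assumptions. It shows why linear attention suits the "event → vector" setting: the attention state is a running sum that is updated in O(1) per event, so each incoming event yields a representation without buffering or synchronizing the stream.

```python
import numpy as np

# Hypothetical sketch (NOT the paper's architecture): each event (x, y, t, p)
# is embedded to a feature vector, and linear attention is computed with a
# running state so the cost per event is constant.

rng = np.random.default_rng(0)
D = 16  # feature dimension (illustrative)

# Random projections standing in for learned embedding and q/k/v maps.
W_emb = rng.standard_normal((4, D)) * 0.1
W_q, W_k, W_v = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))

def feature_map(u):
    # elu(u) + 1: keeps keys/queries positive, a common linear-attention choice
    return np.where(u > 0, u + 1.0, np.exp(u))

class StreamingLinearAttention:
    """Maintains S = sum phi(k) v^T and z = sum phi(k), updated event by event."""
    def __init__(self, dim):
        self.S = np.zeros((dim, dim))
        self.z = np.zeros(dim)

    def step(self, event):
        x = np.asarray(event, dtype=float) @ W_emb   # event -> vector embedding
        q = feature_map(x @ W_q)
        k = feature_map(x @ W_k)
        v = x @ W_v
        self.S += np.outer(k, v)                     # O(D^2) state update
        self.z += k
        return (q @ self.S) / (q @ self.z + 1e-6)    # representation for this event

enc = StreamingLinearAttention(D)
for ev in [(12, 40, 0.001, 1), (13, 40, 0.002, -1), (12, 41, 0.003, 1)]:
    rep = enc.step(ev)  # one D-dimensional representation per incoming event
print(rep.shape)  # → (16,)
```

Because the state (S, z) summarizes the whole history, downstream heads can read out a representation at any event's timestamp, which is what allows an A2S pipeline to stay asynchronous while feeding standard tensor-based models.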
📝 Abstract
Event cameras deliver visual data with high temporal resolution, low latency, and minimal redundancy, yet their asynchronous, sparse sequential nature challenges standard tensor-based machine learning (ML). While the recent asynchronous-to-synchronous (A2S) paradigm aims to bridge this gap by asynchronously encoding events into learned representations for ML pipelines, existing A2S approaches often sacrifice representation expressivity and generalizability compared to dense, synchronous methods. This paper introduces EVA (EVent Asynchronous representation learning), a novel A2S framework that generates highly expressive and generalizable event-by-event representations. Inspired by the analogy between events and language, EVA uniquely adapts advances from language modeling, namely linear attention and self-supervised learning, for its construction. In experiments, EVA outperforms prior A2S methods on recognition tasks (DVS128-Gesture and N-Cars), and is the first A2S framework to successfully master demanding detection tasks, achieving a remarkable 47.7 mAP on the Gen1 dataset. These results underscore EVA's transformative potential for advancing real-time event-based vision applications.