🤖 AI Summary
Traditional microsaccade research is constrained by high cost, low temporal resolution, and poor scalability of frame-based eye-trackers. To address these limitations, this work introduces the first event-driven benchmark for microsaccade recognition: we synthesize a novel, publicly available event-stream microsaccade dataset—featuring seven distinct angular displacement patterns—using high-fidelity Blender rendering and the v2e event camera simulator. We further propose Spiking-VGG16Flow, a spiking neural network (SNN) architecture enhanced with optical flow preprocessing, achieving approximately 90% mean classification accuracy on the dataset. This study constitutes the first systematic integration of event cameras and SNNs into microsaccade recognition, overcoming the spatiotemporal bottlenecks inherent to frame-based approaches. By establishing a new, biologically plausible benchmark for fine-grained oculomotor analysis, our work advances cognitive modeling of subtle eye movements.
📝 Abstract
Microsaccades are small, involuntary eye movements vital for visual perception and neural processing. Traditional microsaccade studies typically use eye trackers or frame-based analysis, which, while precise, are costly and limited in scalability and temporal resolution. Event-based sensing offers a high-speed, low-latency alternative by capturing fine-grained spatiotemporal changes efficiently. This work introduces a pioneering event-based microsaccade dataset to support research on small eye movement dynamics in cognitive computing. Using Blender, we render high-fidelity eye movement scenarios and simulate microsaccades with angular displacements from 0.5 to 2.0 degrees, divided into seven distinct classes. These are converted to event streams using v2e, preserving the natural temporal dynamics of microsaccades, with durations ranging from 0.25 ms to 2.25 ms. We evaluate the dataset using Spiking-VGG11, Spiking-VGG13, and Spiking-VGG16, and propose Spiking-VGG16Flow, an optical-flow-enhanced variant implemented in SpikingJelly. The models achieve around 90% average accuracy, successfully classifying microsaccades by angular displacement, independent of event count or duration. These results demonstrate the potential of spiking neural networks for fine motion recognition and establish a benchmark for event-based vision research. The dataset, code, and trained models will be publicly available at https://waseemshariff126.github.io/microsaccades/.
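The abstract states seven displacement classes spanning 0.5 to 2.0 degrees. Assuming the classes are evenly spaced over that range (the spacing is our assumption; the abstract does not state it), the class centres fall 0.25 degrees apart. A minimal sketch of that labelling scheme:

```python
import numpy as np

# Hypothetical class centres: seven evenly spaced angular displacements
# over the 0.5-2.0 degree range stated in the abstract (even spacing is
# an assumption on our part).
classes_deg = np.linspace(0.5, 2.0, 7)  # 0.5, 0.75, ..., 2.0 degrees

def displacement_to_class(deg: float) -> int:
    """Map an angular displacement in degrees to the nearest class index."""
    return int(np.argmin(np.abs(classes_deg - deg)))
```

Under this assumption, a measured displacement of 1.3 degrees would fall into the 1.25-degree class (index 3).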
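The frame-to-event conversion above relies on the DVS pixel model that simulators such as v2e implement: a pixel emits an ON or OFF event whenever its log intensity drifts past a contrast threshold relative to the last event. The toy sketch below illustrates only that core principle; it is not the authors' pipeline, and real v2e additionally models noise, per-pixel threshold mismatch, and sub-frame timestamp interpolation.

```python
import numpy as np

def frames_to_events(frames, times, threshold=0.2):
    """Toy DVS model: emit (t, x, y, polarity) events whenever a pixel's
    log intensity moves by more than `threshold` since its last event.
    `frames` is a list of 2-D intensity arrays, `times` their timestamps."""
    eps = 1e-3                             # avoid log(0) on dark pixels
    ref = np.log(frames[0] + eps)          # per-pixel reference log intensity
    events = []
    for frame, t in zip(frames[1:], times[1:]):
        log_i = np.log(frame + eps)
        diff = log_i - ref
        # A large brightness change emits several events at the same pixel.
        while True:
            ys, xs = np.where(np.abs(diff) >= threshold)
            if len(xs) == 0:
                break
            pol = np.sign(diff[ys, xs]).astype(int)   # +1 = ON, -1 = OFF
            events.extend(zip([t] * len(xs), xs, ys, pol))
            ref[ys, xs] += pol * threshold            # step reference toward new value
            diff = log_i - ref
    return events
```

For example, a pixel whose intensity doubles between two frames (a log change of about 0.69) crosses a 0.2 threshold three times and therefore contributes three ON events.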