🤖 AI Summary
This survey addresses real-time, high-accuracy pupil-center prediction from eye movement data recorded with event cameras, with the aim of advancing the practical deployment of event-driven eye tracking. It reviews and benchmarks the top-performing methods from the 2025 Event-Based Eye Tracking Challenge, held at the CVPR 2025 Workshop on Event-Based Vision. The surveyed approaches cover techniques such as spiking neural networks, spatiotemporal event-stream encoding, lightweight hybrid architectures, and low-latency online filtering, and each method is reported with its accuracy, model size, and number of operations, highlighting the trade-offs among these metrics. The survey also discusses event-based eye tracking from a hardware-design perspective, toward low-power, high-responsiveness eye tracking that can be deployed on resource-constrained edge platforms.
📝 Abstract
This survey reviews the 2025 Event-Based Eye Tracking Challenge, organized as part of the CVPR 2025 Workshop on Event-Based Vision. The challenge focuses on predicting the pupil center from eye movement data recorded with event cameras. We review and summarize the innovative methods from the top-ranked teams in the challenge to advance future event-based eye tracking research. For each method, accuracy, model size, and number of operations are reported. We also discuss event-based eye tracking from the perspective of hardware design.