Egocentric Event-Based Vision for Ping Pong Ball Trajectory Prediction

📅 2025-06-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Traditional frame-based cameras suffer from motion blur and high latency (>66 ms) in high-speed table tennis tracking. To address this, we propose the first egocentric event-driven method for real-time 3D trajectory prediction. Our approach fuses eye-tracking and IMU data from Meta Project Aria smart glasses to design a foveated event-stream processing paradigm, enabling resource-aware lightweight state estimation and physics-guided 3D trajectory modeling. Key contributions include: (i) the first integration of gaze guidance with event cameras for proactive table tennis trajectory prediction, and (ii) a novel foveated event-processing architecture aligned with human visual mechanisms. Experiments demonstrate an end-to-end worst-case latency of only 4.5 ms (14.7× lower than that of a 30 FPS frame camera), along with significantly reduced trajectory prediction error and a 10.81× decrease in computational load.

๐Ÿ“ Abstract
In this paper, we present a real-time egocentric trajectory prediction system for table tennis using event cameras. Unlike standard cameras, which suffer from high latency and motion blur at fast ball speeds, event cameras provide higher temporal resolution, allowing more frequent state updates, greater robustness to outliers, and accurate trajectory predictions using just a short time window after the opponent's impact. We collect a dataset of ping-pong game sequences, including 3D ground-truth trajectories of the ball, synchronized with sensor data from the Meta Project Aria glasses and event streams. Our system leverages foveated vision, using eye-gaze data from the glasses to process only events in the viewer's fovea. This biologically inspired approach improves ball detection performance and significantly reduces computational latency, as it efficiently allocates resources to the most perceptually relevant regions, achieving a reduction factor of 10.81 on the collected trajectories. Our detection pipeline has a worst-case total latency of 4.5 ms, including computation and perception - significantly lower than a frame-based 30 FPS system, which, in the worst case, takes 66 ms solely for perception. Finally, we fit a trajectory prediction model to the estimated states of the ball, enabling 3D trajectory forecasting in the future. To the best of our knowledge, this is the first approach to predict table tennis trajectories from an egocentric perspective using event cameras.
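The foveated processing described above amounts to discarding events that fall outside the region around the viewer's gaze before any further computation. A minimal sketch of that gating step is shown below; the event layout, gaze coordinates, and fovea radius are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical event batch: one row per event (x_px, y_px, timestamp_us, polarity).
events = np.array([
    [320, 240, 1000, 1],
    [100, 400, 1010, 0],
    [330, 250, 1020, 1],
    [600,  50, 1030, 1],
], dtype=np.int64)

def foveate(events, gaze_xy, radius_px=60):
    """Keep only events within `radius_px` of the current gaze point.

    This mimics the paper's foveated filtering idea: downstream detection
    then runs on a small, perceptually relevant subset of the event stream.
    """
    d = np.hypot(events[:, 0] - gaze_xy[0], events[:, 1] - gaze_xy[1])
    return events[d <= radius_px]

gaze = (325, 245)  # gaze point projected into the event-camera frame (assumed)
fov_events = foveate(events, gaze)
print(len(fov_events), "of", len(events), "events kept")  # → 2 of 4 events kept
```

Because the filter is a single vectorized distance test, its cost scales with the raw event rate but the downstream detector only ever sees the foveal subset, which is where the reported reduction in computational load comes from.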
Problem

Research questions and friction points this paper is trying to address.

Real-time trajectory prediction for table tennis using event cameras
Reducing computational latency with foveated vision and gaze data
First egocentric approach for ping pong trajectory forecasting
Innovation

Methods, ideas, or system contributions that make the work stand out.

Event cameras for high temporal resolution
Foveated vision reduces computational latency
3D trajectory prediction from egocentric view
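The trajectory-forecasting step fits a physics model to the ball states estimated from the short observation window after impact. As a hedged illustration, the sketch below fits a drag-free ballistic model (constant velocity plus gravity) by least squares; the paper's actual physics-guided model may differ, e.g. by including drag or spin, and all numbers here are synthetic.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def fit_ballistic(t, pos):
    """Least-squares fit of a drag-free ballistic model to 3D ball states.

    t: (N,) timestamps in seconds; pos: (N, 3) estimated positions in meters.
    Returns initial position p0 and velocity v0.
    """
    # Adding back the known gravity term makes every axis linear in t:
    # z + 0.5*g*t^2 = z0 + vz*t, so a single linear solve recovers p0, v0.
    A = np.stack([np.ones_like(t), t], axis=1)         # (N, 2) design matrix
    target = pos.copy()
    target[:, 2] += 0.5 * G * t**2
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)  # (2, 3): intercepts, slopes
    return coef[0], coef[1]

def predict(p0, v0, t):
    """Forecast 3D positions at future times t under the same model."""
    t = np.asarray(t, dtype=float)[:, None]
    p = p0 + v0 * t
    p[:, 2] -= 0.5 * G * t[:, 0]**2
    return p

# Synthetic short observation window right after the opponent's impact.
t_obs = np.linspace(0.0, 0.05, 6)
true_p0 = np.array([0.0, 0.0, 1.0])
true_v0 = np.array([5.0, 0.2, 1.5])
obs = predict(true_p0, true_v0, t_obs)

p0, v0 = fit_ballistic(t_obs, obs)
print(np.round(v0, 2))  # recovers the initial velocity from noiseless states
```

With noiseless synthetic states the fit is exact; in practice the frequent state updates enabled by the event camera are what make such a fit stable from only a few tens of milliseconds of observation.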