AI Summary
In high-speed table tennis scenarios, conventional vision systems suffer from motion blur, high latency, and data redundancy, while reinforcement learning approaches often exhibit low sample efficiency, hindering real-time, precise perception and rapid decision-making. This work proposes a framework that integrates neuromorphic event cameras with sample-efficient learning: it detects the ball directly from asynchronous event streams without frame reconstruction, and it introduces a human-inspired progressive reinforcement learning mechanism that uses time-varying rewards and threshold-based stage control to transfer skills from slow to fast dynamics. Experiments demonstrate that, under identical training iterations, the proposed method improves ball-return placement accuracy by 35.8%, substantially easing the perception and decision-making bottlenecks of existing approaches in real-world high-speed settings.
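The detection idea summarized above can be sketched as a density-and-shape filter over raw events. Everything below is an illustrative assumption, not the paper's implementation: the event layout `(x, y, t, polarity)`, the percentile-based motion filter, and the radius bounds are all hypothetical.

```python
import numpy as np

def detect_ball(events, radius_range=(2.0, 12.0), min_events=20):
    """Illustrative ball detection on a raw event batch (no frame reconstruction).

    `events` is an (N, 4) array of (x, y, t, polarity) rows. A fast-moving
    ball produces a dense, compact blob of events; we locate the densest
    region and accept it only if its spatial spread matches a circular
    object in the expected radius range (geometric consistency).
    """
    if len(events) < min_events:
        return None
    xy = events[:, :2].astype(float)
    center = xy.mean(axis=0)
    # Motion cue: keep only events near the centroid of recent activity,
    # discarding sparse background events far from the moving blob.
    dist = np.linalg.norm(xy - center, axis=1)
    blob = xy[dist < np.percentile(dist, 80)]
    if len(blob) < min_events:
        return None
    center = blob.mean(axis=0)
    radius = np.linalg.norm(blob - center, axis=1).mean() * 1.5
    # Geometric consistency: accept only blobs whose size matches a ball.
    lo, hi = radius_range
    return (center, radius) if lo <= radius <= hi else None
```

Because the filter operates on a short asynchronous event batch rather than a reconstructed frame, it inherits the sensor's high temporal resolution and avoids motion blur by construction.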
Abstract
Perception and decision-making in high-speed dynamic scenarios remain challenging for current robots, whereas humans and animals perceive and decide rapidly in such environments. Taking table tennis as a typical example, conventional frame-based vision sensors suffer from motion blur, high latency, and data redundancy, and therefore struggle to meet real-time, accurate perception requirements. Inspired by the human visual system, event-based perception methods address these limitations through asynchronous sensing, high temporal resolution, and inherently sparse data representations. However, current event-based methods are still restricted to simplified, unrealistic ball-only scenarios. Meanwhile, existing decision-making approaches typically require thousands of interactions with the environment to converge, incurring significant computational cost. In this work, we present a biologically inspired approach for high-speed table tennis robots that combines event-based perception with sample-efficient learning. On the perception side, we propose an event-based ball detection method that leverages motion cues and geometric consistency and operates directly on asynchronous event streams without frame reconstruction, achieving robust and efficient detection in real-world rallies. On the decision-making side, we introduce a human-inspired, sample-efficient training strategy that first trains policies in low-speed scenarios, progressively acquiring skills from basic to advanced, and then adapts them to high-speed scenarios, guided by a case-dependent temporally adaptive reward and a reward-threshold mechanism. Under the same number of training episodes, our method improves return-to-target accuracy by 35.8%. These results demonstrate the effectiveness of biologically inspired perception and decision-making for high-speed robotic systems.
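The progressive training strategy can be sketched as a curriculum loop: a reward whose weighting varies over training time, and a reward-threshold gate that advances the policy from slow to fast ball speeds only once recent performance is good enough. The specific reward shape, speed stages, threshold, and window below are hypothetical placeholders, not the paper's case-dependent formulation.

```python
def temporally_adaptive_reward(error, t, horizon, w_early=0.3, w_late=1.0):
    """Time-varying reward weight: accuracy is penalized more heavily
    later in training (an assumed shaping, for illustration only)."""
    w = w_early + (w_late - w_early) * min(t / horizon, 1.0)
    return -w * error

def progressive_training(train_episode, speeds=(2.0, 4.0, 6.0),
                         threshold=-0.5, window=20, max_episodes=200):
    """Reward-threshold curriculum: move to the next (faster) ball speed
    only when the moving-average reward over the last `window` episodes
    at the current speed exceeds `threshold`.

    `train_episode(speed, ep)` runs one episode and returns its reward.
    Returns a log of (episode, speed, reward) tuples.
    """
    stage, history, log = 0, [], []
    for ep in range(max_episodes):
        r = train_episode(speeds[stage], ep)
        history.append(r)
        log.append((ep, speeds[stage], r))
        recent = history[-window:]
        if (len(recent) == window
                and sum(recent) / window > threshold
                and stage < len(speeds) - 1):
            stage += 1
            history = []  # restart the statistics for the new, faster stage
    return log
```

The gate mirrors how a human learner only moves to faster rallies after mastering slower ones, which is what makes the slow-stage experience reusable at high speed instead of wasted.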