AI Summary
Humanoid robots face significant challenges in racket-based sports, including tight coupling between perception and action, difficulty in visual tracking and trajectory prediction under stringent time constraints, and the need for whole-body coordinated control. This work proposes an end-to-end hierarchical framework that integrates onboard visual perception, physics-informed trajectory prediction, and a large-scale pretrained whole-body controller (SONIC). For the first time, it enables real-time tennis striking using only onboard sensing, without relying on external motion capture systems. The approach eliminates the need to retrain low-level policies when adapting to new tasks or platforms, and demonstrates successful vision-guided hitting in real-world experiments on the Unitree G1 humanoid, validating its effectiveness and practicality.
Abstract
Dynamic ball-interaction tasks remain challenging for robots because they require tight perception-action coupling under limited reaction time. This challenge is especially pronounced in humanoid racket sports, where successful interception depends on accurate visual tracking, trajectory prediction, coordinated stepping, and stable whole-body striking. Existing robotic racket-sport systems often rely on external motion capture for state estimation or on task-specific low-level controllers that must be retrained across tasks and platforms. We present CyboRacket, a hierarchical perception-to-action framework for humanoid racket sports that integrates onboard visual perception, physics-based trajectory prediction, and large-scale pre-trained whole-body control. The framework uses onboard cameras to track the incoming object, predicts its future trajectory, and converts the estimated interception state into target end-effector and base-motion commands for whole-body execution by SONIC on the Unitree G1 humanoid robot. We evaluate the proposed framework in a vision-based humanoid tennis-hitting task. Experimental results demonstrate real-time visual tracking, trajectory prediction, and successful striking using purely onboard sensing.
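To make the "predict the trajectory, then derive an interception state" step concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a drag-free ballistic model, a z-up world frame, and a fixed vertical hitting plane at `x = x_hit`. The function names (`fit_ballistic`, `predict`, `intercept_at_plane`) are hypothetical and chosen only for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed; z axis points up)


def fit_ballistic(t, p):
    """Least-squares fit of initial position and velocity to tracked positions.

    t: (N,) timestamps in seconds; p: (N, 3) ball positions in metres.
    Assumes gravity is the only force (no drag, no spin).
    """
    t = np.asarray(t, dtype=float)
    p = np.array(p, dtype=float)  # copy so the caller's data is not modified
    # Add back the known gravity term so the remaining motion is linear in t.
    p[:, 2] += 0.5 * G * t**2
    A = np.stack([np.ones_like(t), t], axis=1)  # columns: [1, t]
    coeffs, *_ = np.linalg.lstsq(A, p, rcond=None)
    return coeffs[0], coeffs[1]  # p0, v0


def predict(p0, v0, t):
    """Ballistic position at time t, measured from the fit origin."""
    pos = p0 + v0 * t
    pos[2] -= 0.5 * G * t**2
    return pos


def intercept_at_plane(p0, v0, x_hit):
    """Time and position at which the ball crosses the plane x = x_hit.

    Returns None if the ball never reaches the plane in the future.
    """
    if abs(v0[0]) < 1e-9:
        return None
    t_hit = (x_hit - p0[0]) / v0[0]
    if t_hit < 0:
        return None
    return t_hit, predict(p0, v0, t_hit)
```

In a pipeline like the one described, the returned time and position would be the kind of interception state a higher-level planner turns into end-effector and base-motion targets. A real system would likely also need to model drag, bounce, and measurement noise.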