🤖 AI Summary
Fast, robust, and continuous kicking on humanoid football robots is difficult for three reasons: the leg must swing at high speed, the robot must stay stable on a single support leg, and the controller must tolerate perception noise and external disturbances (e.g., opponent interactions). This paper proposes a four-stage teacher–student training framework that integrates realistic perception noise modeling, hierarchical policy distillation, and online constrained reinforcement learning to narrow the sim-to-real gap. It also introduces a hierarchical reward function and whole-body motion constraints that jointly optimize kicking speed and postural stability. The approach achieves high-precision shooting and consistent scoring both in simulation and on the real-world H1 humanoid robot platform. Ablation studies quantify the contribution of each component, particularly to robustness, adaptability, and generalization across dynamic environments.
📝 Abstract
Learning fast and robust ball-kicking skills is a critical capability for humanoid soccer robots, yet it remains challenging because it requires rapid leg swings, postural stability on a single support foot, and robustness to noisy sensory input and external perturbations (e.g., opponents). This paper presents a reinforcement learning (RL)-based system that enables humanoid robots to execute robust continual ball-kicking and to adapt to different ball-goal configurations. The system extends a typical teacher-student training framework, in which a "teacher" policy is trained with ground-truth state information and a "student" learns to mimic it from noisy, imperfect sensing, into four training stages: (1) long-distance ball chasing (teacher); (2) directional kicking (teacher); (3) teacher policy distillation (student); and (4) student adaptation and refinement (student). Key design elements, including tailored reward functions, realistic noise modeling, and online constrained RL for adaptation and refinement, are critical for closing the sim-to-real gap and sustaining performance under perceptual uncertainty. Extensive evaluations both in simulation and on a real robot demonstrate strong kicking accuracy and goal-scoring success across diverse ball-goal configurations. Ablation studies further highlight the necessity of constrained RL, noise modeling, and the adaptation stage. This work presents a system for learning robust continual humanoid ball-kicking under imperfect perception, establishing a benchmark task for visuomotor skill learning in humanoid whole-body control.
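The teacher-student idea behind stage (3) can be illustrated with a toy sketch. It is not the authors' implementation: it assumes linear policies, Gaussian observation noise, and a least-squares fit in place of neural networks and RL training. All names (`make_noisy_observation`, `distill_student`, the dimensions, the noise level) are hypothetical.

```python
import numpy as np

def make_noisy_observation(state, noise_std, rng):
    """Hypothetical sensing model: true state plus zero-mean Gaussian noise."""
    return state + rng.normal(0.0, noise_std, size=state.shape)

def distill_student(teacher_weights, states, noise_std, rng):
    """Fit a linear student that maps noisy observations to teacher actions."""
    # The teacher acts on ground-truth state (privileged information).
    teacher_actions = states @ teacher_weights
    # The student only ever sees a corrupted view of the same states.
    noisy_obs = make_noisy_observation(states, noise_std, rng)
    # Least-squares imitation: minimize ||noisy_obs @ W - teacher_actions||^2.
    student_weights, *_ = np.linalg.lstsq(noisy_obs, teacher_actions, rcond=None)
    return student_weights

rng = np.random.default_rng(0)
state_dim, action_dim = 6, 3  # assumed toy dimensions

# Stand-in for an already trained teacher policy (random linear map).
teacher_weights = rng.normal(size=(state_dim, action_dim))
states = rng.normal(size=(5000, state_dim))
student_weights = distill_student(teacher_weights, states, noise_std=0.1, rng=rng)

# Compare teacher (true state) and student (noisy observation) on held-out states.
test_states = rng.normal(size=(1000, state_dim))
teacher_actions = test_states @ teacher_weights
student_actions = make_noisy_observation(test_states, 0.1, rng) @ student_weights
imitation_mse = float(np.mean((student_actions - teacher_actions) ** 2))
print(f"imitation MSE: {imitation_mse:.4f}")
```

In this toy setting the student's remaining error comes mostly from the observation noise itself, which is one way to see why the abstract adds a separate adaptation and refinement stage after distillation.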