🤖 AI Summary
This work addresses the challenges of training instability and reward conflicts arising from tight perception-action coupling in humanoid robot soccer tasks. The authors propose the PAiD framework, which employs a three-stage progressive learning strategy: first, acquiring fundamental kicking motions from human motion capture; second, introducing a lightweight perception-action fusion module to enable position generalization; and third, incorporating physics-aware sim-to-real transfer to bridge the domain gap. This approach effectively decouples perception and control, circumventing objective conflicts inherent in end-to-end training. Evaluated on the Unitree G1 platform, the method achieves high-fidelity, robust humanoid kicking capabilities across diverse scenarios—including static and rolling balls, multiple starting positions, and external perturbations—while maintaining consistent performance in both indoor and outdoor environments.
📝 Abstract
Soccer presents a significant challenge for humanoid robots, demanding tightly integrated perception-action capabilities for tasks like perception-guided kicking and whole-body balance control. Existing approaches suffer from inter-module instability in modular pipelines or conflicting training objectives in end-to-end frameworks. We propose Perception-Action integrated Decision-making (PAiD), a progressive architecture that decomposes soccer skill acquisition into three stages: motion-skill acquisition via human motion tracking, lightweight perception-action integration for positional generalization, and physics-aware sim-to-real transfer. This staged decomposition establishes stable foundational skills, avoids reward conflicts during perception integration, and minimizes sim-to-real gaps. Experiments on the Unitree G1 demonstrate high-fidelity human-like kicking with robust performance under diverse conditions (static or rolling balls, various positions, and disturbances) while maintaining consistent execution across indoor and outdoor scenarios. Our divide-and-conquer strategy advances robust humanoid soccer capabilities and offers a scalable framework for complex embodied skill acquisition. The project page is available at https://soccer-humanoid.github.io/.