AI Summary
This work addresses the challenge of safely and stably traversing platforms higher than leg length, a task where existing humanoid robots often resort to high-impact, torque-limited jumping strategies. The authors propose APEX, a system that perceives local terrain geometry and adaptively composes climbing, walking, crawling, and posture adjustment skills to achieve high-platform traversal. A novel velocity-independent, general-purpose ratchet progress reward provides dense supervision for learning each skill, and the six distinct skills are then distilled into a single unified policy capable of smooth, context- and command-driven behavior transitions. Trained via deep reinforcement learning, the LiDAR-based whole-body controller incorporates map artifact modeling in simulation and elevation map repair during deployment, effectively bridging the Sim2Real gap in perception. Validated on the Unitree G1 robot, the approach achieves zero-shot transfer to 0.8-meter-high platforms (114% of leg length), demonstrating robust initial pose adaptability and seamless multi-skill coordination.
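As a rough sketch of how such a ratchet progress reward might be computed per step (the class name, the choice of progress measure, and the penalty value below are illustrative assumptions, not the paper's exact formulation):

```python
class RatchetProgressReward:
    """Velocity-independent 'ratchet' progress reward (illustrative sketch).

    Only improvements over the best-so-far task progress are rewarded, and
    steps that fail to improve it receive a small penalty, so the signal is
    dense without prescribing any target velocity.
    """

    def __init__(self, no_improve_penalty: float = 0.01):
        self.no_improve_penalty = no_improve_penalty
        self.best_progress = None

    def reset(self, initial_progress: float) -> None:
        """Call at episode start with the initial progress measure."""
        self.best_progress = initial_progress

    def step(self, progress: float) -> float:
        """Return the reward for the current progress measure.

        `progress` can be any scalar that grows toward task completion,
        e.g. the torso's advance toward a goal region on top of the
        platform (this particular choice is an assumption, not the
        paper's definition).
        """
        improvement = progress - self.best_progress
        if improvement > 0.0:
            self.best_progress = progress   # ratchet: lock in the new best
            return improvement              # dense reward for new progress only
        return -self.no_improve_penalty     # penalize non-improving steps
```

Because reward is paid only for new best-so-far progress, the per-episode return telescopes to the total progress achieved regardless of how quickly it accrues, while the small penalty discourages stalling during exploration.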
Abstract
Humanoid locomotion has advanced rapidly with deep reinforcement learning (DRL), enabling robust feet-based traversal over uneven terrain. Yet platforms beyond leg length remain largely out of reach because current RL training paradigms often converge to jumping-like solutions that are high-impact, torque-limited, and unsafe for real-world deployment. To address this gap, we propose APEX, a system for perceptive, climbing-based high-platform traversal that composes terrain-conditioned behaviors: climb-up and climb-down at vertical edges, walking or crawling on the platform, and stand-up and lie-down for posture reconfiguration. Central to our approach is a generalized ratchet progress reward for learning contact-rich, goal-reaching maneuvers. It tracks the best-so-far task progress and penalizes non-improving steps, providing dense yet velocity-free supervision that enables efficient exploration under strong safety regularization. Based on this formulation, we train LiDAR-based full-body maneuver policies and reduce the sim-to-real perception gap through a dual strategy: modeling mapping artifacts during training and applying filtering and inpainting to elevation maps during deployment. Finally, we distill all six skills into a single policy that autonomously selects behaviors and transitions based on local geometry and commands. Experiments on a 29-DoF Unitree G1 humanoid demonstrate zero-shot sim-to-real traversal of 0.8-meter platforms (approximately 114% of leg length), with robust adaptation to platform height and initial pose, as well as smooth and stable multi-skill transitions.
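As one illustration of what the deployment-side elevation-map repair could look like, the sketch below inpaints missing cells by nearest-neighbor fill and clamps isolated spike artifacts toward the local median; the helper name `repair_elevation_map`, the threshold, and the filter size are placeholder assumptions, not the authors' implementation.

```python
import numpy as np
from scipy import ndimage


def repair_elevation_map(elev: np.ndarray, spike_thresh: float = 0.15) -> np.ndarray:
    """Illustrative repair of a 2D elevation map (meters).

    Missing cells (NaN) are inpainted from their nearest valid neighbor,
    then spike artifacts that deviate strongly from the 3x3 median are
    clamped to that median. Threshold and filter size are placeholders.
    """
    repaired = elev.astype(float)
    missing = np.isnan(repaired)

    # Inpaint: copy each missing cell from the nearest valid cell.
    if missing.any() and not missing.all():
        _, nearest = ndimage.distance_transform_edt(missing, return_indices=True)
        repaired = repaired[tuple(nearest)]

    # Filter: suppress spikes that deviate strongly from the local median.
    local_median = ndimage.median_filter(repaired, size=3)
    spikes = np.abs(repaired - local_median) > spike_thresh
    repaired[spikes] = local_median[spikes]
    return repaired


# Toy usage: a dropout (NaN) and a 2.5 m spike next to a 0.8 m platform edge.
raw = np.array([[0.0, np.nan, 0.8],
                [0.0, 0.02,   0.8],
                [0.0, 0.01,   2.5]])
clean = repair_elevation_map(raw)
```

Modeling comparable dropout and spike artifacts in the simulated height maps during training is the complementary half of the dual strategy described above, so that the policy sees similar inputs in simulation and on the robot.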