🤖 AI Summary
This work addresses the challenge of achieving robust, high-speed locomotion for humanoid robots in complex, unstructured environments, where proprioception alone is insufficient and incorporating exteroception typically introduces state-estimation drift and poor training scalability. To overcome these limitations, the authors propose an end-to-end perception-to-action framework that maps raw depth images and proprioceptive inputs directly to joint actions, without explicit state estimation. The approach combines terrain edge detection, a foot-volume-based foothold safety mechanism, and a flat-region sampling strategy to improve training stability and deployment safety. Trained with a single-stage reinforcement learning scheme, the policy enables a full-scale humanoid robot to traverse challenging terrain at speeds up to 2.5 m/s. The code is open-sourced and supports real-world deployment with minimal hardware modifications.
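The summarized architecture is a direct mapping from raw depth plus proprioception to joint actions, with no intermediate elevation map or state estimator. A minimal sketch of that interface, using a toy numpy MLP with hypothetical dimensions (64x64 depth image, 45-D proprioception, 23 joints; the paper's actual network sizes and encoder are not specified here):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(sizes):
    """Randomly initialized weights for a small MLP (illustrative only)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.tanh(x)
    return x

# Hypothetical dimensions: 64x64 depth image, 45-D proprioception, 23 joints.
depth_encoder = make_mlp([64 * 64, 128, 32])   # flattens depth into a 32-D feature
policy_head   = make_mlp([32 + 45, 128, 23])   # feature + proprio -> joint targets

def act(depth_image, proprio):
    """Single forward pass: raw depth + proprioception -> joint actions."""
    feat = forward(depth_encoder, depth_image.ravel())
    return forward(policy_head, np.concatenate([feat, proprio]))

actions = act(rng.standard_normal((64, 64)), rng.standard_normal(45))
print(actions.shape)  # (23,)
```

The key design point the summary highlights is that everything downstream of the sensors is one learned function, so there is no mapping or odometry stage whose drift could corrupt the terrain representation.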
📝 Abstract
Achieving robust humanoid hiking in complex, unstructured environments requires moving from reactive proprioception to proactive perception. Integrating exteroception, however, remains a significant challenge: mapping-based methods suffer from state-estimation drift (LiDAR-based pipelines, for instance, handle torso jitter poorly), while existing end-to-end approaches often struggle with scalability and training complexity, e.g., prior works that hand-craft virtual obstacles case by case. In this work, we present *Hiking in the Wild*, a scalable, end-to-end perceptive parkour framework designed for robust humanoid hiking. To ensure safety and training stability, we introduce two key mechanisms: a foothold safety mechanism that combines scalable *Terrain Edge Detection* with *Foot Volume Points* to prevent catastrophic slippage on edges, and a *Flat Patch Sampling* strategy that mitigates reward hacking by generating feasible navigation targets. Our approach uses a single-stage reinforcement learning scheme that maps raw depth inputs and proprioception directly to joint actions, without relying on external state estimation. Extensive field experiments on a full-size humanoid demonstrate that our policy enables robust traversal of complex terrain at speeds up to 2.5 m/s. The training and deployment code is open-sourced to facilitate reproducible research and deployment on real robots with minimal hardware modifications.
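The *Flat Patch Sampling* idea, restricting navigation targets to terrain regions that are actually standable, can be illustrated with a simple local-flatness test on a heightmap. This is a hedged sketch, not the paper's implementation: the grid resolution, patch size, and flatness threshold (`max_std`) are assumed values chosen for the toy example.

```python
import numpy as np

def sample_flat_targets(heightmap, cell, patch=5, max_std=0.02, n_targets=4, seed=0):
    """Pick navigation targets whose surrounding patch is nearly flat.

    heightmap: (H, W) terrain heights in meters; cell: grid resolution in meters.
    A cell qualifies if the height standard deviation of the patch centered on
    it is below max_std, so sampled goals avoid edges and steep slopes.
    """
    rng = np.random.default_rng(seed)
    H, W = heightmap.shape
    r = patch // 2
    flat = []
    for i in range(r, H - r):
        for j in range(r, W - r):
            if heightmap[i - r:i + r + 1, j - r:j + r + 1].std() < max_std:
                flat.append((i, j))
    picks = rng.choice(len(flat), size=min(n_targets, len(flat)), replace=False)
    # Return world-frame (x, y) coordinates of the chosen flat cells.
    return [(flat[k][0] * cell, flat[k][1] * cell) for k in picks]

# Toy terrain: flat ground with one raised block; targets never land on its rim.
hm = np.zeros((40, 40))
hm[15:25, 15:25] = 0.3
targets = sample_flat_targets(hm, cell=0.05)
print(len(targets))  # 4
```

Filtering goals this way addresses the reward-hacking failure mode the abstract mentions: if targets could land on edges or infeasible cells, the policy could be rewarded for unreachable or unsafe commands during training.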