Contrastive Representation Learning for Robust Sim-to-Real Transfer of Adaptive Humanoid Locomotion

📅 2025-09-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Purely proprioceptive locomotion policies for humanoid robots lack environmental foresight, whereas perception-driven systems are vulnerable to sensor noise and environmental disturbances. Method: The paper proposes a “distilled perception” framework that uses contrastive representation learning to encode implicit terrain and dynamics information, available as privileged information in simulation, into the policy's latent space, giving a proprioceptive policy the ability to anticipate its environment. This latent representation drives an adaptive gait clock, reconciling the trade-off between rigid rhythmic control and clock-free strategies. The approach combines privileged information distillation, latent-variable modeling, and reinforcement learning to enable zero-shot sim-to-real transfer. Results: Evaluated on a full-scale humanoid robot, the method traverses 30 cm steps and 26.5° inclines without exteroceptive perception at deployment, indicating robustness across challenging terrains.

📝 Abstract
Reinforcement learning has produced remarkable advances in humanoid locomotion, yet a fundamental dilemma persists for real-world deployment: policies must choose between the robustness of reactive proprioceptive control and the proactivity of complex, fragile perception-driven systems. This paper resolves this dilemma by introducing a paradigm that imbues a purely proprioceptive policy with proactive capabilities, achieving the foresight of perception without its deployment-time costs. Our core contribution is a contrastive learning framework that compels the actor's latent state to encode privileged environmental information from simulation. Crucially, this "distilled awareness" empowers an adaptive gait clock, allowing the policy to proactively adjust its rhythm based on an inferred understanding of the terrain. This synergy resolves the classic trade-off between rigid, clocked gaits and unstable clock-free policies. We validate our approach with zero-shot sim-to-real transfer to a full-sized humanoid, demonstrating highly robust locomotion over challenging terrains, including 30 cm high steps and 26.5° slopes. Website: https://lu-yidan.github.io/cra-loco.
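The abstract does not state which contrastive objective is used. As a rough illustration only, the sketch below assumes an InfoNCE-style loss in which each actor latent is pulled toward the privileged encoding from the same simulation step and pushed away from the encodings of the other samples in the batch. The function name `info_nce_loss`, the cosine similarity, and the `temperature` value are all assumptions made for this example, not details taken from the paper.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def info_nce_loss(actor_latents, privileged_latents, temperature=0.1):
    """Hypothetical InfoNCE loss between two batches of latent vectors.

    Row i of actor_latents and row i of privileged_latents form the
    positive pair; every other row of privileged_latents is a negative.
    """
    actors = [_normalize(v) for v in actor_latents]
    targets = [_normalize(v) for v in privileged_latents]
    total = 0.0
    for i, actor in enumerate(actors):
        # Cosine similarity to every privileged encoding, scaled by temperature.
        logits = [
            sum(x * y for x, y in zip(actor, target)) / temperature
            for target in targets
        ]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index as the correct class.
        total += log_denominator - logits[i]
    return total / len(actors)


# Toy batch: matched pairs should score far better than shuffled pairs.
batch = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_matched = info_nce_loss(batch, batch)
loss_mismatched = info_nce_loss(batch, batch[1:] + batch[:1])
```

In a training setup of this kind, the privileged encoding would be available only in simulation, so a term like this could act as an auxiliary loss next to the reinforcement learning objective while the deployed actor still consumes proprioception alone.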

Problem

Research questions and friction points this paper is trying to address.

Resolves trade-off between reactive proprioception and proactive perception
Enables robust humanoid locomotion on challenging terrains
Achieves zero-shot sim-to-real transfer without deployment-time perception costs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive learning for environmental encoding
Distilled awareness enabling adaptive gait clock
Zero-shot sim-to-real transfer validation
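
The page does not describe how the adaptive gait clock is parameterized. One common pattern, assumed here purely for illustration, is a phase variable whose frequency is a nominal value plus a bounded offset chosen by the policy, with the phase fed back to the policy as a sine and cosine pair. The class name `AdaptiveGaitClock`, the frequency limits, and the offset interface are hypothetical.

```python
import math


class AdaptiveGaitClock:
    """Hypothetical gait clock whose stepping frequency the policy can adjust."""

    def __init__(self, base_frequency_hz=1.5, min_hz=0.5, max_hz=3.0):
        # All three values are illustrative assumptions, not paper settings.
        self.base_frequency_hz = base_frequency_hz
        self.min_hz = min_hz
        self.max_hz = max_hz
        self.phase = 0.0  # normalized gait phase in [0, 1)

    def step(self, frequency_offset_hz, dt):
        """Advance the phase by one control step and return it."""
        frequency = self.base_frequency_hz + frequency_offset_hz
        # Clamp so the policy cannot request an implausible cadence.
        frequency = min(max(frequency, self.min_hz), self.max_hz)
        self.phase = (self.phase + frequency * dt) % 1.0
        return self.phase

    def observation(self):
        """Encode the phase as (sin, cos) so it is continuous at wrap-around."""
        angle = 2.0 * math.pi * self.phase
        return math.sin(angle), math.cos(angle)


clock = AdaptiveGaitClock()
half_cycle = clock.step(frequency_offset_hz=0.5, dt=0.25)  # 2.0 Hz for 0.25 s

clamped = AdaptiveGaitClock()
clamped_phase = clamped.step(frequency_offset_hz=100.0, dt=0.1)  # capped at 3.0 Hz
```

A fixed offset of zero recovers a rigid clocked gait, while an unbounded offset approaches a clock-free policy, which is one way to picture the trade-off the paper says the adaptive clock reconciles.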

Authors
Yidan Lu
Adaptive Robotic Controls Lab (ArcLab), Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China
Rurui Yang
PNDbotics, China
Qiran Kou
PNDbotics, China
Mengting Chen
Alibaba Group
Generative Modeling, Computer Vision
Tao Fan
Sichuan University of Science and Engineering
Synchronization of complex networks, Consensus of multi-agent systems, Wireless sensor networks
Peter Cui
PNDbotics, China
Yinzhao Dong
Adaptive Robotic Controls Lab (ArcLab), Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China
Peng Lu
Adaptive Robotic Controls Lab (ArcLab), Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China