Contrastive Representation Learning for Robust Sim-to-Real Transfer of Adaptive Humanoid Locomotion

📅 2025-09-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Purely proprioceptive locomotion policies for humanoid robots lack environmental foresight, whereas perception-driven systems are vulnerable to sensor noise and environmental disturbances. Method: We propose a “distilled awareness” framework that uses contrastive representation learning to encode privileged terrain and dynamics information, available only in simulation, into the policy's latent space, giving a proprioceptive policy the ability to anticipate its environment. This latent representation drives an adaptive gait clock, reconciling the trade-off between rigid rhythmic control and clock-free strategies. The approach combines privileged information distillation, latent-variable modeling, and reinforcement learning to enable zero-shot sim-to-real transfer. Results: Evaluated on a full-scale humanoid robot, the method traverses 30 cm steps and 26.5° inclines using only proprioceptive input at deployment, with no exteroceptive perception, demonstrating robustness and generalization across challenging unstructured terrains.

📝 Abstract

Reinforcement learning has produced remarkable advances in humanoid locomotion, yet a fundamental dilemma persists for real-world deployment: policies must choose between the robustness of reactive proprioceptive control and the proactivity of complex, fragile perception-driven systems. This paper resolves this dilemma by introducing a paradigm that imbues a purely proprioceptive policy with proactive capabilities, achieving the foresight of perception without its deployment-time costs. Our core contribution is a contrastive learning framework that compels the actor's latent state to encode privileged environmental information from simulation. Crucially, this "distilled awareness" empowers an adaptive gait clock, allowing the policy to proactively adjust its rhythm based on an inferred understanding of the terrain. This synergy resolves the classic trade-off between rigid, clocked gaits and unstable clock-free policies. We validate our approach with zero-shot sim-to-real transfer to a full-sized humanoid, demonstrating highly robust locomotion over challenging terrains, including 30 cm high steps and 26.5° slopes, proving the effectiveness of our method. Website: https://lu-yidan.github.io/cra-loco.
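
The abstract does not state which contrastive objective is used. As an illustration only, the sketch below assumes an InfoNCE-style loss, a common choice for this kind of alignment: the actor's latent at each timestep is pulled toward the embedding of the privileged simulation state from the same timestep and pushed away from the other embeddings in the batch. The function names, the temperature value, and the batch layout are all assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def info_nce(actor_latents, privileged_embeddings, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper).

    Row i of `actor_latents` and row i of `privileged_embeddings` form the
    positive pair; every other row of `privileged_embeddings` is a negative.
    """
    z = [l2_normalize(v) for v in actor_latents]
    e = [l2_normalize(v) for v in privileged_embeddings]
    total = 0.0
    for i, z_i in enumerate(z):
        # Cosine similarity of this latent to every privileged embedding.
        logits = [sum(a * b for a, b in zip(z_i, e_j)) / temperature for e_j in e]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index i as the target class.
        total += log_sum_exp - logits[i]
    return total / len(z)
```

In a training loop of this kind, the loss would typically be added to the reinforcement learning objective so that the privileged encoder is needed only in simulation; at deployment the actor would run on proprioceptive history alone.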

Problem

Research questions and friction points this paper is trying to address.

Resolves trade-off between reactive proprioception and proactive perception

Enables robust humanoid locomotion on challenging terrains

Achieves zero-shot sim-to-real transfer without the deployment-time costs of perception

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive learning for environmental encoding

Distilled awareness enabling adaptive gait clock

Zero-shot sim-to-real transfer validation
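
The summary does not describe how the adaptive gait clock is parameterized. A minimal sketch, assuming the common formulation in which the policy outputs a frequency offset that is added to a nominal stepping frequency and integrated into a phase variable, could look like the following. The class name, the frequency bounds, and the sine/cosine phase encoding are assumptions for illustration, not details from the paper.

```python
import math


class AdaptiveGaitClock:
    """Phase oscillator whose frequency the policy can modulate (hypothetical)."""

    def __init__(self, base_freq_hz=1.5, min_freq_hz=0.5, max_freq_hz=3.0):
        self.base_freq_hz = base_freq_hz
        self.min_freq_hz = min_freq_hz
        self.max_freq_hz = max_freq_hz
        self.phase = 0.0  # normalized gait phase in [0, 1)

    def step(self, freq_offset_hz, dt):
        """Advance the phase by one control step and return its encoding.

        `freq_offset_hz` stands in for a policy output conditioned on the
        learned latent; a fixed clock corresponds to an offset of zero.
        """
        freq = self.base_freq_hz + freq_offset_hz
        # Clamp so the commanded rhythm stays within a safe range.
        freq = min(max(freq, self.min_freq_hz), self.max_freq_hz)
        self.phase = (self.phase + freq * dt) % 1.0
        angle = 2.0 * math.pi * self.phase
        # A sine/cosine pair avoids the discontinuity at phase wrap-around.
        return math.sin(angle), math.cos(angle)
```

With an offset of zero this reduces to a rigid clocked gait, and letting the offset vary with the inferred terrain is one way to sit between a fixed clock and a clock-free policy.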

🔎 Similar Papers

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation