🤖 AI Summary
Existing sim-to-real transfer methods predominantly rely on fixed-parameter domain randomization, which struggles to cover complex, unseen reality gaps and thus limits the generalization of bipedal locomotion policies. To address this, we propose State-Dependent Torque Perturbation (SDTP), a technique that injects state-conditioned disturbances directly into the torque space during simulation training, explicitly modeling the nonlinear physical discrepancies present in real-world environments. SDTP is the first approach to shift perturbation from the conventional parameter domain to the torque domain, enabling robust policy learning via reinforcement learning integrated with forward dynamics simulation, and we rigorously evaluate its out-of-distribution generalization. Experiments demonstrate that SDTP significantly improves transfer performance on real humanoid robots, achieving stable walking under diverse, unseen deviations in mass, friction, and actuator latency, and outperforming standard domain randomization baselines in robustness.
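Concretely, in our own notation (not necessarily the paper's), the torque fed to the forward dynamics during training is the policy's command plus a state-conditioned offset:

$$\tau_{\text{sim}} = \tau_{\pi}(s) + \delta(s;\,\phi)$$

where s is the robot state, τ_π the torque commanded by the policy, and δ(·; φ) a perturbation function whose parameters φ are varied during training. Fixed-parameter domain randomization instead perturbs physical parameters (masses, friction coefficients, latencies) while leaving the commanded torque unchanged.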
📝 Abstract
This paper proposes a novel alternative to existing sim-to-real methods for training control policies with simulated experience. Prior sim-to-real methods for legged robots mostly rely on domain randomization, in which a fixed, finite set of simulation parameters is randomized during training. Instead, our method adds state-dependent perturbations to the input joint torque used for forward simulation during the training phase. These state-dependent perturbations are designed to simulate a broader range of reality gaps than can be captured by randomizing a fixed set of simulation parameters. Experimental results show that our method produces humanoid locomotion policies that are more robust to complex reality gaps unseen in the training domain.
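To make the mechanism concrete, below is a minimal Python sketch, under our own assumptions, of how such perturbations might be injected during training. The perturbation model (a small random network resampled per episode), its scale, and the `sim`/`policy` interfaces are illustrative placeholders, not the paper's implementation.

```python
"""Sketch: state-dependent torque perturbation during simulated training.

All names (perturbation scale, network shape, sim/policy API) are assumed
for illustration; they do not reproduce the paper's exact method.
"""
import numpy as np

rng = np.random.default_rng(0)


class StateDependentTorquePerturbation:
    """Random, state-conditioned torque offset, resampled per episode.

    A small random network maps the robot state to a bounded torque offset,
    so the disturbance varies nonlinearly with the state rather than being
    a fixed parameter draw as in standard domain randomization.
    """

    def __init__(self, state_dim: int, num_joints: int, scale: float = 2.0):
        self.scale = scale  # assumed max perturbation magnitude (N*m)
        # Weights resampled per episode so each rollout sees a different reality gap.
        self.w1 = rng.normal(0.0, 1.0, (state_dim, 32))
        self.w2 = rng.normal(0.0, 1.0, (32, num_joints))

    def __call__(self, state: np.ndarray) -> np.ndarray:
        h = np.tanh(state @ self.w1)
        return self.scale * np.tanh(h @ self.w2)  # bounded, state-dependent offset


def simulation_step(sim, policy, perturb):
    """One control step: the policy's torque is perturbed before forward dynamics.

    `sim` and `policy` are hypothetical interfaces standing in for the simulator
    and the learned controller.
    """
    state = sim.get_state()                    # joint positions/velocities, base state
    tau_policy = policy(state)                 # torque command from the policy
    tau_applied = tau_policy + perturb(state)  # inject state-dependent disturbance
    return sim.forward(tau_applied)            # advance dynamics with perturbed torque


if __name__ == "__main__":
    # Standalone demo: the same perturbation object gives different offsets
    # for different states, unlike a fixed parameter offset.
    perturb = StateDependentTorquePerturbation(state_dim=36, num_joints=12)
    for _ in range(2):
        print(perturb(rng.normal(size=36)))
```

Because the offset is a nonlinear function of the state rather than a fixed parameter draw, each training episode exposes the policy to a different, state-coupled dynamics error, which is the kind of reality gap that randomizing a fixed parameter set does not enumerate.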