Unified Walking, Running, and Recovery for Humanoids via State-Dependent Adversarial Motion Priors

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work enables a humanoid robot to walk, run, and recover from falls without explicit mode-switching commands. The authors propose a unified reinforcement learning framework based on state-dependent Adversarial Motion Priors (AMP): a gate defined by a single fixed threshold on projected gravity routes each training transition to either a velocity-conditioned locomotion discriminator or a dedicated recovery discriminator. Only three LAFAN1 reference clips are needed to regularize the full behavior set. The resulting policy is deployed as a single frozen ONNX model that runs at 50 Hz on the physical Unitree G1 with no runtime mode logic, and hardware experiments show recovery from both prone and supine falls as well as smooth walk-to-run transitions under the same controller.

📝 Abstract

We propose a unified reinforcement learning framework that enables a single policy to perform walking, running, and fall recovery on the Unitree G1 humanoid robot, validated on physical hardware without any explicit mode-switching command at deployment. The framework extends Adversarial Motion Priors (AMP) by replacing the conventional global reference distribution with a state-dependent gate that routes each training transition to one of two discriminators: a dedicated recovery discriminator and a velocity-conditioned locomotion discriminator that jointly covers walking and running. The gate is defined by a single fixed threshold on projected gravity: the recovery discriminator is activated when body tilt exceeds approximately $37^\circ$ from vertical ($|g_z+1|>0.6$); otherwise the locomotion discriminator is used, with the normalized commanded velocity serving as a condition that selects the appropriate reference trajectory between walk and run clips. Only three LAFAN1 reference clips are required to regularize the complete behavior set. At deployment, a single frozen ONNX policy executes at 50\,Hz with no runtime mode logic; hardware experiments demonstrate successful recovery from both prone and supine falls and smooth walk-to-run transitions under the same controller.
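The gating rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes `g_z` is the z-component of the unit gravity vector expressed in the robot base frame (so `g_z = -1` when upright), and the names `route_transition`, `normalized_command`, and the speed limit `v_max` are hypothetical. Only the inequality `|g_z + 1| > 0.6` comes from the abstract. Under the assumed convention the inequality reduces to `1 - cos(tilt) > 0.6`, which is a tilt beyond roughly 66 degrees, whereas a 37 degree tilt would correspond to `sin(tilt) = 0.6`; the sketch follows the inequality as written.

```python
import math

# Threshold taken from the abstract: recovery when |g_z + 1| > 0.6.
RECOVERY_THRESHOLD = 0.6


def use_recovery_discriminator(g_z: float, threshold: float = RECOVERY_THRESHOLD) -> bool:
    """Return True when the transition should go to the recovery discriminator.

    Assumption: g_z is the z-component of the unit gravity vector in the
    base frame, equal to -1 when the robot stands upright.
    """
    return abs(g_z + 1.0) > threshold


def normalized_command(v_cmd: float, v_max: float) -> float:
    """Hypothetical normalization: map commanded speed to [0, 1]."""
    return min(abs(v_cmd) / v_max, 1.0)


def route_transition(g_z: float, v_cmd: float, v_max: float = 3.0):
    """Pick a discriminator for one training transition.

    Returns ("recovery", None) or ("locomotion", condition), where the
    condition is the normalized commanded velocity. v_max is an
    illustrative value, not one reported in the source.
    """
    if use_recovery_discriminator(g_z):
        return ("recovery", None)
    return ("locomotion", normalized_command(v_cmd, v_max))


def g_z_from_tilt(tilt_deg: float) -> float:
    """Gravity z-component for a given tilt from vertical (assumed convention)."""
    return -math.cos(math.radians(tilt_deg))


if __name__ == "__main__":
    for tilt in (0, 37, 60, 70, 90, 180):
        print(tilt, route_transition(g_z_from_tilt(tilt), v_cmd=1.5))
```

Because the gate depends only on the robot's own state, the same rule can label every training transition without any mode command, which is consistent with the abstract's claim that the deployed policy needs no runtime mode logic.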

Problem

Research questions and friction points this paper is trying to address.

humanoid locomotion

fall recovery

unified control policy

mode-free switching

bipedal motion

Innovation

Methods, ideas, or system contributions that make the work stand out.

state-dependent gating

adversarial motion priors

unified locomotion policy