🤖 AI Summary
Existing reinforcement learning controllers for humanoid robots struggle to achieve sustained autonomous dynamic running. This work proposes a method that combines hard-constrained trajectory optimization with dynamic motion retargeting to generate a high-quality library of periodic reference motions from a single segment of human running data. A goal-conditioned and control-guided reward mechanism is designed to train the reinforcement learning policy. The approach enables, for the first time, high-speed (3.3 m/s), long-distance (hundreds of meters) autonomous running on a real humanoid robot (Unitree G1). Furthermore, it is successfully integrated into a full-stack perception-planning system that supports real-time dynamic obstacle avoidance, significantly enhancing the robot's locomotion capabilities and task scalability in complex outdoor environments.
📝 Abstract
Humanoid robots promise to locomote like humans, including fast, dynamic running. Reinforcement learning (RL) controllers that mimic human motions have recently become popular because they can generate highly dynamic behaviors, but they are often restricted to single-motion playback, which hinders their deployment for long-duration, autonomous locomotion. In this paper, we present a pipeline that dynamically retargets human motions through an optimization routine with hard constraints, generating improved periodic reference libraries from a single human demonstration. We then study the effect of both the reference motion and the reward structure on reference and commanded-velocity tracking, concluding that a goal-conditioned, control-guided reward that tracks dynamically optimized human data yields the best performance. We deploy the policy on hardware, demonstrating its speed and endurance by reaching running speeds of up to 3.3 m/s on a Unitree G1 robot and traversing hundreds of meters in real-world environments. Additionally, to demonstrate the controllability of the locomotion, we integrate the controller into a full perception and planning autonomy stack for obstacle avoidance while running outdoors.
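The abstract does not spell out the reward formula, but the shape of a goal-conditioned, control-guided reward can be sketched: one term imitates the retargeted reference motion, another tracks the commanded velocity, and the two are blended. The sketch below is a minimal illustration only; the function names, weights, and error scales are hypothetical and not taken from the paper.

```python
import numpy as np

def tracking_kernel(value, target, scale):
    """Exponential tracking kernel: 1.0 at a perfect match, decaying with squared error."""
    err = np.sum((np.asarray(value) - np.asarray(target)) ** 2)
    return float(np.exp(-err / scale))

def goal_conditioned_reward(joint_pos, ref_joint_pos, base_vel, cmd_vel,
                            w_ref=0.6, w_vel=0.4, sigma_ref=0.25, sigma_vel=0.25):
    """Blend reference-motion imitation with commanded-velocity tracking.

    joint_pos / ref_joint_pos: current and reference joint positions,
    base_vel / cmd_vel: current base velocity and the commanded goal velocity.
    Weights and scales here are placeholders, not the paper's values.
    """
    r_ref = tracking_kernel(joint_pos, ref_joint_pos, sigma_ref)  # imitation term
    r_vel = tracking_kernel(base_vel, cmd_vel, sigma_vel)         # goal-tracking term
    return w_ref * r_ref + w_vel * r_vel
```

With both terms perfectly satisfied the reward is `w_ref + w_vel`; any tracking error on either term lowers it smoothly, which is what lets a single policy trade off staying close to the reference gait against following the operator's velocity command.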