Terrain Consistent Reference-Guided RL for Humanoid Navigation Autonomy

📅 2026-05-15
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses a failure mode in long-range humanoid navigation: SE(2) reference trajectories that are incompatible with the terrain geometry the robot must cross. The authors propose a terrain-aware, reference-guided reinforcement learning framework that modulates SE(2)-controllable reference trajectories inside the training loop so that they remain consistent with the terrain. Desired footsteps are projected onto valid footholds, and the swing-foot and center-of-mass trajectories are adjusted to match the terrain. The resulting policy exposes an SE(2) velocity interface compatible with standard navigation planners. In simulation, terrain-conditioned references significantly improve reference tracking compared with terrain-agnostic references. On hardware, the policy is paired with a model predictive control (MPC) and control barrier function planner on the Unitree G1, achieving closed-loop autonomous navigation over more than 70 meters of outdoor terrain, including rough ground and consecutive flights of stairs, with all sensing and computation onboard.
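To make the SE(2) velocity interface concrete, the following is a minimal sketch of how a constant SE(2) command could be rolled out into a nominal, terrain-agnostic footstep sequence. It is an illustration only, not the authors' implementation: the names `SE2Velocity` and `nominal_footsteps`, the step duration, and the stance half-width are all assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class SE2Velocity:
    vx: float  # forward velocity in the body frame [m/s]
    vy: float  # lateral velocity in the body frame [m/s]
    wz: float  # yaw rate [rad/s]


def nominal_footsteps(cmd, n_steps, step_time=0.4, half_width=0.1):
    """Integrate a constant SE(2) command and place alternating feet.

    step_time and half_width are hypothetical values, not from the paper.
    Returns a list of (x, y, yaw) footstep targets in the world frame.
    """
    x = y = yaw = 0.0
    steps = []
    for i in range(n_steps):
        # Euler-integrate the body pose over one step duration.
        x += (cmd.vx * math.cos(yaw) - cmd.vy * math.sin(yaw)) * step_time
        y += (cmd.vx * math.sin(yaw) + cmd.vy * math.cos(yaw)) * step_time
        yaw += cmd.wz * step_time
        # Alternate left (+1) and right (-1) along the body's lateral axis.
        side = 1.0 if i % 2 == 0 else -1.0
        fx = x - side * half_width * math.sin(yaw)
        fy = y + side * half_width * math.cos(yaw)
        steps.append((fx, fy, yaw))
    return steps
```

Calling `nominal_footsteps(SE2Velocity(0.5, 0.0, 0.0), 4)` yields four footsteps advancing along the x-axis and alternating left and right of the body path. These nominal targets ignore the terrain entirely, which is the kind of environment-agnostic reference the summary contrasts against.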
📝 Abstract
We present a method for training reference-guided, perceptive reinforcement learning locomotion policies for humanoid robots in which reference trajectories are modulated in training to be consistent with terrain geometry. Aiming to deploy our method with standard navigation autonomy infrastructure, we synthesize SE(2)-controllable reference trajectories inside the RL training loop, projecting desired footsteps onto valid footholds and adjusting swing-foot and center-of-mass trajectories to match the terrain. The resulting policy exposes a clean SE(2) velocity interface compatible with standard navigation planners. In simulation, environmentally conditioned references significantly improve reference tracking performance compared to environment-agnostic references. On hardware, we integrate the policy with an MPC + control barrier function planner and demonstrate long-horizon (>70 m) closed-loop autonomous navigation on the Unitree G1 through outdoor environments containing rough terrain and consecutive flights of stairs, with all sensing and computation onboard.
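The two terrain adjustments named in the abstract, projecting a desired footstep onto a valid foothold and reshaping the swing-foot trajectory, can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's method: the grid heightmap representation, the flatness test in `is_valid`, the brute-force search in `project_foothold`, and the parabolic profile in `swing_height` are all hypothetical choices made for this example.

```python
def is_valid(heightmap, r, c, max_step=0.03):
    """Assumed criterion: a cell is a valid foothold if it lies inside the
    map and every in-map 4-neighbour is within max_step metres of it."""
    rows, cols = len(heightmap), len(heightmap[0])
    if not (0 <= r < rows and 0 <= c < cols):
        return False
    h = heightmap[r][c]
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            if abs(heightmap[rr][cc] - h) > max_step:
                return False
    return True


def project_foothold(heightmap, r, c, max_radius=5):
    """Return (row, col, height) of the nearest valid cell to the nominal
    cell (r, c), or None if no valid cell is within max_radius cells."""
    best = None
    for rr in range(r - max_radius, r + max_radius + 1):
        for cc in range(c - max_radius, c + max_radius + 1):
            if is_valid(heightmap, rr, cc):
                d2 = (rr - r) ** 2 + (cc - c) ** 2
                if best is None or d2 < best[0]:
                    best = (d2, rr, cc)
    if best is None:
        return None
    _, rr, cc = best
    return rr, cc, heightmap[rr][cc]


def swing_height(phase, z_start, z_end, clearance=0.08):
    """Swing-foot height for phase in [0, 1]: a linear blend between the
    endpoint heights plus a parabolic bump whose mid-swing apex sits
    `clearance` metres above the higher endpoint."""
    apex = max(z_start, z_end) + clearance
    base = (1.0 - phase) * z_start + phase * z_end
    bump = apex - 0.5 * (z_start + z_end)
    return base + 4.0 * phase * (1.0 - phase) * bump
```

For a heightmap with a single 0.15 m step edge, a nominal footstep that lands on the edge is moved to the nearest cell whose neighbours are level with it, and the swing profile then starts at the lift-off height, peaks above the higher of the two surfaces at mid-swing, and ends at the projected foothold height.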
Problem

Research questions and friction points this paper is trying to address.

humanoid navigation
terrain consistency
reference-guided RL
autonomous locomotion
SE(2) control
Innovation

Methods, ideas, or system contributions that make the work stand out.

reference-guided RL
terrain-consistent locomotion
SE(2)-controllable policy
humanoid navigation autonomy
onboard perception and control