Walk the PLANC: Physics-Guided RL for Agile Humanoid Locomotion on Constrained Footholds

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of enabling humanoid robots to walk on constrained footholds such as stepping stones, which requires precise coordination of balance, timing, and contact decisions. Conventional model-based approaches depend on high-fidelity terrain modeling, while pure reinforcement learning struggles to produce accurate step sequences; to bridge this gap, the authors propose a physics-informed reinforcement learning framework. The framework integrates dynamically consistent stepping targets, generated by a reduced-order-model gait planner, into policy training and employs a Control Lyapunov Function (CLF)-based reward design that encodes structured physical priors instead of relying solely on reward shaping. The method improves foothold accuracy and task success without requiring precise environmental perception, achieving highly reliable stepping-stone locomotion on a real humanoid robot and significantly outperforming model-free reinforcement learning baselines.
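As a concrete illustration of the CLF-based reward idea, the sketch below assumes a quadratic Lyapunov function V(x) = x^T P x over a tracking-error state and rewards transitions that satisfy the decrease condition dV/dt + lam*V <= 0. The matrix P, the rate lam, the scale alpha, and the exponential shaping are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def clf_reward(x, x_next, dt, P, lam=2.0, alpha=5.0):
    """Reward satisfying the CLF decrease condition dV/dt + lam*V <= 0
    along one sampled transition x -> x_next (hypothetical reward term)."""
    V = float(x @ P @ x)                       # V(x) = x^T P x
    V_next = float(x_next @ P @ x_next)
    V_dot = (V_next - V) / dt                  # finite-difference estimate of dV/dt
    violation = max(0.0, V_dot + lam * V)      # positive only when the condition is violated
    return float(np.exp(-alpha * violation))   # 1 when satisfied, decays with violation

# Example: a tracking error shrinking toward zero earns the full reward.
P = np.eye(2)
r = clf_reward(np.array([0.10, -0.05]), np.array([0.08, -0.04]), dt=0.02, P=P)
```

A term like this can be added alongside the usual task and regularization rewards, so the policy is pushed toward trajectories along which the Lyapunov function decays at a prescribed rate rather than relying on hand-tuned shaping alone.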

πŸ“ Abstract
Bipedal humanoid robots must precisely coordinate balance, timing, and contact decisions when locomoting on constrained footholds such as stepping stones, beams, and planks -- even minor errors can lead to catastrophic failure. Classical optimization and control pipelines handle these constraints well but depend on highly accurate mathematical representations of terrain geometry, making them prone to error when perception is noisy or incomplete. Meanwhile, reinforcement learning has shown strong resilience to disturbances and modeling errors, yet end-to-end policies rarely discover the precise foothold placement and step sequencing required for discontinuous terrain. These contrasting limitations motivate approaches that guide learning with physics-based structure rather than relying purely on reward shaping. In this work, we introduce a locomotion framework in which a reduced-order stepping planner supplies dynamically consistent motion targets that steer the RL training process via Control Lyapunov Function (CLF) rewards. This combination of structured footstep planning and data-driven adaptation produces accurate, agile, and hardware-validated stepping-stone locomotion on a humanoid robot, substantially improving reliability compared to conventional model-free reinforcement-learning baselines.
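To make the role of the reduced-order stepping planner concrete, the sketch below uses the standard Linear Inverted Pendulum (LIP) model to turn the current center-of-mass state into a foothold target. The LIP propagation and capture-point formula are textbook results, while the velocity-tracking offset, the function name, and all parameter values are simplified assumptions rather than the authors' planner.

```python
import numpy as np

def lip_step_target(com_pos, com_vel, z0=0.9, g=9.81, t_rem=0.35, v_des=0.3):
    """Predict the CoM state at touchdown under LIP dynamics (state measured
    relative to the stance foot) and place the swing foot near the capture
    point, shifted to track a desired forward velocity."""
    omega = np.sqrt(g / z0)                        # LIP natural frequency
    c, s = np.cosh(omega * t_rem), np.sinh(omega * t_rem)
    pos_td = com_pos * c + (com_vel / omega) * s   # CoM position at touchdown
    vel_td = com_pos * omega * s + com_vel * c     # CoM velocity at touchdown
    # Capture-point placement with a simple velocity-tracking offset
    # (a Raibert-style heuristic, not a full gait optimization).
    return pos_td + vel_td / omega - v_des / omega

# Example: sagittal foothold target from the current CoM state.
step_x = lip_step_target(com_pos=0.05, com_vel=0.35)
```

In a framework like the one described here, such a target would not be tracked open loop; it would be supplied to the RL policy as a dynamically consistent reference that the CLF rewards encourage it to reach.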
Problem

Research questions and friction points this paper is trying to address.

humanoid locomotion
constrained footholds
stepping-stone navigation
balance control
contact planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-Guided Reinforcement Learning
Control Lyapunov Function
Reduced-Order Stepping Planner
Humanoid Locomotion
Constrained Footholds