Reinforcement Learning with Data Bootstrapping for Dynamic Subgoal Pursuit in Humanoid Robot Navigation

πŸ“… 2025-06-02
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Addressing the challenge of balancing computational efficiency and motion accuracy in real-time safe navigation for humanoid robots, this paper proposes a dynamic subgoal-driven hierarchical navigation framework. At the high level, deep reinforcement learning (PPO/SAC) operates in the robot’s body-fixed frame to dynamically generate semantic subgoals; at the low level, model predictive control (MPC), integrated with kinematic modeling, synthesizes stable and robust gait trajectories. A novel model-guided data bootstrapping mechanism is introduced to significantly improve RL training efficiency and policy stability. Evaluated on the Digit robot simulation platform, the method achieves superior navigation success rates and generalization capability compared to both traditional model-based and state-of-the-art end-to-end learning approaches in complex cluttered environments. It simultaneously guarantees real-time performance (>30 Hz) and strong robustness against environmental uncertainties and dynamic disturbances.
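The two-rate structure described above, a low-rate RL policy proposing subgoals in the robot's body-fixed frame and a faster MPC tracker turning them into gait commands, can be sketched as below. This is a minimal illustration under assumed interfaces: `policy`, `mpc`, and the `env` object are hypothetical stand-ins, not the paper's actual APIs.

```python
import numpy as np

def world_to_body(goal_xy, robot_xy, robot_yaw):
    """Express a world-frame goal position in the robot's body-fixed frame."""
    c, s = np.cos(-robot_yaw), np.sin(-robot_yaw)
    d = np.asarray(goal_xy, dtype=float) - np.asarray(robot_xy, dtype=float)
    return np.array([c * d[0] - s * d[1], s * d[0] + c * d[1]])

def navigate(env, policy, mpc, horizon_steps=10, max_steps=300):
    """Hierarchical loop: the high-level policy re-plans a body-frame subgoal
    every `horizon_steps` ticks; the low-level MPC runs at every tick."""
    obs = env.reset()
    subgoal_body = None
    for t in range(max_steps):
        if t % horizon_steps == 0:        # low-rate subgoal generation
            subgoal_body = policy(obs)    # e.g. (dx, dy) relative to the robot
        action = mpc(obs, subgoal_body)   # high-rate gait tracking
        obs, done = env.step(action)
        if done:
            break
    return obs
```

Planning subgoals in the body-fixed frame (as `world_to_body` illustrates) makes the learned policy invariant to the robot's absolute pose, which is one common motivation for robot-centric coordinates.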

πŸ“ Abstract
Safe and real-time navigation is fundamental for humanoid robot applications. However, existing bipedal robot navigation frameworks often struggle to balance computational efficiency with the precision required for stable locomotion. We propose a novel hierarchical framework that continuously generates dynamic subgoals to guide the robot through cluttered environments. Our method comprises a high-level reinforcement learning (RL) planner for subgoal selection in a robot-centric coordinate system and a low-level Model Predictive Control (MPC) based planner that produces robust walking gaits to reach these subgoals. To expedite and stabilize the training process, we incorporate a data bootstrapping technique that leverages a model-based navigation approach to generate a diverse, informative dataset. We validate our method in simulation using the Agility Robotics Digit humanoid across multiple scenarios with random obstacles. Results show that our framework significantly improves navigation success rates and adaptability compared to both the original model-based method and other learning-based methods.
Problem

Research questions and friction points this paper is trying to address.

Balancing computational efficiency and locomotion precision in humanoid robot navigation
Generating dynamic subgoals for navigation in cluttered environments
Improving training stability and speed with data bootstrapping techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical RL-MPC framework for subgoal navigation
Data bootstrapping enhances training stability
Robot-centric dynamic subgoal generation
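The data-bootstrapping idea listed above, seeding RL training with experience collected by the model-based navigation approach, might look like the following minimal sketch. All names here (`model_planner`, the `env` interface, the buffer layout) are hypothetical illustrations, not the paper's implementation.

```python
from collections import deque

def bootstrap_buffer(model_planner, env, buffer, n_episodes=50, max_steps=200):
    """Warm-start an off-policy replay buffer with transitions from a
    model-based planner before RL updates begin (hypothetical sketch)."""
    for _ in range(n_episodes):
        obs = env.reset()
        for _ in range(max_steps):
            action = model_planner(obs)                 # model-based action
            next_obs, reward, done = env.step(action)
            buffer.append((obs, action, reward, next_obs, done))
            obs = next_obs
            if done:
                break
    return buffer

# A bounded deque is one simple replay-buffer choice:
replay = deque(maxlen=100_000)
```

Pre-filling the buffer this way gives early RL updates informative, feasible trajectories instead of random exploration, which is the usual rationale for bootstrapping from a model-based expert.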
Chengyang Peng
The Ohio State University
Robotics
Zhihao Zhang
Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA
Shiting Gong
The Ohio State University
Robotics
Sankalp Agrawal
Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA
Keith A. Redmill
Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA
Ayonga Hereid
Assistant Professor, The Ohio State University
Robotics, Cyber-Physical Systems, Optimal Control, Machine Learning