🤖 AI Summary
Addressing the challenge of balancing computational efficiency and motion accuracy in real-time safe navigation for humanoid robots, this paper proposes a dynamic subgoal-driven hierarchical navigation framework. At the high level, deep reinforcement learning (PPO/SAC) operates in the robot's body-fixed frame to dynamically generate semantic subgoals; at the low level, model predictive control (MPC), integrated with kinematic modeling, synthesizes stable and robust gait trajectories. A novel model-guided data bootstrapping mechanism is introduced to significantly improve RL training efficiency and policy stability. Evaluated on the Digit robot simulation platform, the method achieves superior navigation success rates and generalization capability compared to both traditional model-based and state-of-the-art end-to-end learning approaches in complex cluttered environments. It simultaneously guarantees real-time performance (>30 Hz) and strong robustness against environmental uncertainties and dynamic disturbances.
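The two-level structure described above can be sketched as a minimal loop: a high-level planner proposes a subgoal in the robot's body-fixed frame at a low rate, and a high-rate low-level tracker (standing in here for the MPC gait planner) walks toward it. All function names, gains, and the heuristic subgoal rule below are illustrative assumptions, not the paper's actual policy or controller.

```python
import math

def high_level_subgoal(robot_xy, robot_yaw, goal_xy, reach=1.0):
    """Return a subgoal in the body-fixed frame, clipped to a reachable radius.
    In the paper, a learned PPO/SAC policy replaces this heuristic."""
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    # Rotate the world-frame offset into the body frame.
    bx = math.cos(-robot_yaw) * dx - math.sin(-robot_yaw) * dy
    by = math.sin(-robot_yaw) * dx + math.cos(-robot_yaw) * dy
    dist = math.hypot(bx, by)
    if dist > reach:  # keep subgoals within one planning horizon
        bx, by = bx * reach / dist, by * reach / dist
    return bx, by

def navigate(start_xy, goal_xy, tol=0.05, max_outer=100):
    """Outer loop: low-rate subgoal generation; inner loop: high-rate tracking.
    The inner tracker is a stand-in for the MPC-based gait planner."""
    xy, yaw = start_xy, 0.0
    for _ in range(max_outer):
        if math.hypot(goal_xy[0] - xy[0], goal_xy[1] - xy[1]) < tol:
            return xy, True
        bx, by = high_level_subgoal(xy, yaw, goal_xy)
        # Convert the body-frame subgoal to a world-frame waypoint once,
        # then let the low-level tracker take several steps toward it.
        wx = xy[0] + math.cos(yaw) * bx - math.sin(yaw) * by
        wy = xy[1] + math.sin(yaw) * bx + math.cos(yaw) * by
        for _ in range(5):
            dx, dy = wx - xy[0], wy - xy[1]
            d = math.hypot(dx, dy)
            if d < 1e-6:
                break
            step = min(0.2, d)  # bounded step length per control tick
            yaw = math.atan2(dy, dx)
            xy = (xy[0] + step * math.cos(yaw), xy[1] + step * math.sin(yaw))
    return xy, False
```

The key design point the sketch preserves is that the subgoal is expressed in the robot-centric frame, so the high-level policy never needs global coordinates as input.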
📝 Abstract
Safe and real-time navigation is fundamental for humanoid robot applications. However, existing bipedal robot navigation frameworks often struggle to balance computational efficiency with the precision required for stable locomotion. We propose a novel hierarchical framework that continuously generates dynamic subgoals to guide the robot through cluttered environments. Our method comprises a high-level reinforcement learning (RL) planner for subgoal selection in a robot-centric coordinate system and a low-level Model Predictive Control (MPC) based planner that produces robust walking gaits to reach these subgoals. To expedite and stabilize the training process, we incorporate a data bootstrapping technique that leverages a model-based navigation approach to generate a diverse, informative dataset. We validate our method in simulation using the Agility Robotics Digit humanoid across multiple scenarios with random obstacles. Results show that our framework significantly improves navigation success rates and adaptability compared to both the original model-based method and other learning-based methods.
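The data-bootstrapping idea in the abstract can be illustrated with a small sketch: transitions collected by an existing model-based navigation policy pre-fill the RL replay buffer before learning starts, so early policy updates draw on informative experience rather than random exploration. The `ReplayBuffer` class, the one-dimensional toy dynamics, and `model_based_action` below are hypothetical illustrations, not the paper's implementation.

```python
import random

class ReplayBuffer:
    """Minimal FIFO replay buffer for (s, a, r, s') transitions."""
    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.data = []

    def add(self, transition):
        if len(self.data) >= self.capacity:
            self.data.pop(0)  # evict the oldest transition
        self.data.append(transition)

    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))

def model_based_action(state):
    """Stand-in for the model-based planner's action: head toward the goal (origin)."""
    return -0.1 * state

def bootstrap(buffer, n_transitions=500, seed=0):
    """Fill the buffer with rollouts from the model-based policy on a 1-D toy task."""
    rng = random.Random(seed)
    state = rng.uniform(-5.0, 5.0)
    for _ in range(n_transitions):
        action = model_based_action(state)
        next_state = state + action
        reward = -abs(next_state)  # closer to the goal is better
        buffer.add((state, action, reward, next_state))
        state = next_state

buffer = ReplayBuffer()
bootstrap(buffer)
```

After bootstrapping, RL training (PPO/SAC in the paper's case) would begin sampling mixed batches of demonstration and on-policy data from this buffer; the exact mixing schedule is a design choice not specified here.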