LIPM-Guided Reinforcement Learning for Stable and Perceptive Locomotion in Bipedal Robots

📅 2025-09-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Achieving stable, perception-guided locomotion for bipedal robots in unstructured outdoor environments remains challenging due to complex terrain geometry and external disturbances. Method: This paper proposes a Linear Inverted Pendulum Model (LIPM)-guided reinforcement learning framework. The LIPM is explicitly embedded into the reward function to provide a principled basis for jointly regulating center-of-mass height and torso orientation. A Reward Fusion Module (RFM) dynamically balances velocity-tracking and stability objectives, while a dual-critic architecture evaluates the two objectives separately to improve policy robustness against disturbances. Contributions/Results: Evaluated in both simulation and on a physical robot, the method achieves stable walking across diverse rugged terrains, including stairs, slopes, and uneven ground, under persistent external perturbations, and improves training efficiency, disturbance rejection, speed adaptability, and cross-terrain generalization. The LIPM-informed design also keeps the learned behavior interpretable and practical to deploy in the field.
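For reference, the LIPM mentioned above models the robot as a point mass held at a constant center-of-mass (CoM) height atop a massless leg, which linearizes the sagittal CoM dynamics. The standard form is (generic notation, not necessarily the paper's):

```latex
\ddot{x}_{\mathrm{com}} = \frac{g}{z_c}\left(x_{\mathrm{com}} - p_x\right)
```

where \(x_{\mathrm{com}}\) is the horizontal CoM position, \(p_x\) the stance-foot (zero-moment-point) location, \(g\) gravity, and \(z_c\) the constant CoM height. The dynamics stay linear only while \(z_c\) is held constant, which is why a LIPM-guided reward regulates CoM height alongside torso orientation.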

📝 Abstract
Achieving stable and robust perceptive locomotion for bipedal robots in unstructured outdoor environments remains a critical challenge due to complex terrain geometry and susceptibility to external disturbances. In this work, we propose a novel reward design inspired by the Linear Inverted Pendulum Model (LIPM) to enable perceptive and stable locomotion in the wild. The LIPM provides theoretical guidance for dynamic balance by regulating the center of mass (CoM) height and the torso orientation. These are key factors for terrain-aware locomotion, as they help ensure a stable viewpoint for the robot's camera. Building on this insight, we design a reward function that promotes balance and dynamic stability while encouraging accurate CoM trajectory tracking. To adaptively trade off between velocity tracking and stability, we leverage the Reward Fusion Module (RFM) approach that prioritizes stability when needed. A double-critic architecture is adopted to separately evaluate stability and locomotion objectives, improving training efficiency and robustness. We validate our approach through extensive experiments on a bipedal robot in both simulation and real-world outdoor environments. The results demonstrate superior terrain adaptability, disturbance rejection, and consistent performance across a wide range of speeds and perceptual conditions.
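The abstract describes the RFM only at a high level, so the following is a minimal sketch of one way such an adaptive trade-off could look: the fusion weight shifts toward the stability objective as torso tilt or CoM-height error grows. All function names, reward shapes, and tolerances here are hypothetical, not the paper's implementation.

```python
import math

def fused_reward(vel_err, tilt, com_height_err,
                 tilt_tol=0.2, height_tol=0.05):
    """Hypothetical RFM-style reward fusion.

    Both component rewards lie in (0, 1]; the fusion weight moves
    toward the stability term whenever torso tilt or CoM-height
    error exceeds its tolerance, so stability dominates when needed.
    """
    # Velocity-tracking reward: highest when commanded velocity is matched.
    r_vel = math.exp(-vel_err ** 2)
    # Stability reward: penalizes torso tilt and CoM-height deviation.
    r_stab = math.exp(-(tilt / tilt_tol) ** 2
                      - (com_height_err / height_tol) ** 2)
    # Adaptive weight: low stability -> weight stability more heavily.
    w = 1.0 - r_stab
    return (1.0 - w) * r_vel + w * r_stab
```

With this shaping, a perfectly stable robot is rewarded purely for velocity tracking, while a destabilized one is driven to recover balance first.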
Problem

Research questions and friction points this paper is trying to address.

Achieving stable perceptive locomotion in unstructured outdoor environments
Addressing complex terrain geometry and external disturbances for bipedal robots
Balancing velocity tracking with dynamic stability in varying conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LIPM-inspired reward design for stable locomotion
Reward Fusion Module adaptively prioritizes stability
Double-critic architecture evaluates stability and locomotion separately
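As a rough illustration of the double-critic idea, the sketch below keeps a separate one-step TD advantage per reward stream and blends them before the policy update. The names and the fixed blending weight are hypothetical; the paper's actual critic update is not reproduced here.

```python
def dual_critic_advantage(r_loco, r_stab,
                          v_loco, v_loco_next,
                          v_stab, v_stab_next,
                          gamma=0.99, beta=0.5):
    """Hypothetical double-critic sketch.

    One critic values the locomotion reward stream, the other the
    stability stream; their TD errors are blended with weight beta
    to form the advantage used by the policy gradient.
    """
    adv_loco = r_loco + gamma * v_loco_next - v_loco  # locomotion TD error
    adv_stab = r_stab + gamma * v_stab_next - v_stab  # stability TD error
    return (1.0 - beta) * adv_loco + beta * adv_stab
```

Keeping the two value estimates separate avoids one objective's return scale drowning out the other's during training, which is one plausible reason such designs report better training efficiency.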
Authors
Haokai Su
School of Automation and Intelligent Manufacturing (AIM), Southern University of Science and Technology, Shenzhen, China
Haoxiang Luo
Professor of Mechanical Engineering, Vanderbilt University
Fluid mechanics, computational fluid dynamics, biofluid, fluid-structure interaction
Shunpeng Yang
School of Automation and Intelligent Manufacturing (AIM), Southern University of Science and Technology, Shenzhen, China; Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, China
Kaiwen Jiang
Department of Mechanical Engineering, The University of Hong Kong, Hong Kong, China
Wei Zhang
School of Automation and Intelligent Manufacturing (AIM), Southern University of Science and Technology, Shenzhen, China; LimX Dynamics, Shenzhen, China
Hua Chen
Zhejiang University-University of Illinois Urbana-Champaign Institute (ZJUI), Zhejiang University, Haining, China; LimX Dynamics, Shenzhen, China