🤖 AI Summary
Problem: Quadruped robots are not robust when walking on vertically oscillating terrain such as vibrating bridges, where conventional control methods fail to generalize to dynamic, non-stationary ground.
Method: The paper proposes a reinforcement learning framework with zero-shot sim-to-real transfer for environments with dynamic ground disturbances. The authors build a MuJoCo simulation of an oscillating bridge that includes its resonance dynamics, and train policies for the Unitree Go2 robot using PPO with domain randomization. They combine five gaits with three training conditions: a rigid bridge, and two oscillating-bridge setups that regulate height relative to either the bridge surface or the ground. The resulting policies transfer to the real bridge without prior exposure to it.
Results: Policies trained on the simulated oscillating bridge achieve a 37% improvement in stability when deployed on the real oscillating bridge, outperforming rigid-ground baselines in both disturbance rejection and sim-to-real generalization. The work offers a transferable approach to quadrupedal locomotion on dynamic terrain that does not depend on prior knowledge of the disturbance.
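To illustrate what "resonance dynamics" means for a bridge like this, here is a minimal sketch that treats the vertical bridge mode as a single-degree-of-freedom mass-spring-damper. This is not the simulation model from the paper. The 2.0 Hz eigenfrequency comes from the abstract, while the modal mass, damping ratio, and forcing amplitude are hypothetical values chosen only for illustration.

```python
import math


def steady_state_amplitude(force_n, forcing_hz, natural_hz=2.0,
                           modal_mass_kg=1000.0, damping_ratio=0.01):
    """Steady-state amplitude (m) of m*z'' + c*z' + k*z = F0*cos(w*t).

    natural_hz is taken from the abstract; modal_mass_kg and damping_ratio
    are hypothetical placeholders, not values reported by the authors.
    """
    omega = 2.0 * math.pi * forcing_hz      # forcing angular frequency
    omega_n = 2.0 * math.pi * natural_hz    # natural angular frequency
    stiffness = modal_mass_kg * omega_n ** 2
    damping = 2.0 * damping_ratio * modal_mass_kg * omega_n
    return force_n / math.sqrt(
        (stiffness - modal_mass_kg * omega ** 2) ** 2 + (damping * omega) ** 2
    )


if __name__ == "__main__":
    # Sweep the forcing frequency to see the response peak near 2.0 Hz.
    for hz in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
        print(f"{hz:.1f} Hz -> {steady_state_amplitude(100.0, hz):.6f} m")
```

In this simplified model, forcing at the eigenfrequency amplifies the static deflection by a factor of 1 / (2 * damping_ratio), which is why a footfall rhythm near 2.0 Hz would excite a lightly damped structure far more than the same force applied at other frequencies.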
📝 Abstract
Legged robots, particularly quadrupeds, excel at navigating rough terrain, yet their performance under vertical ground perturbations, such as those from oscillating surfaces, remains underexplored. This study introduces a novel approach to enhance quadruped locomotion robustness by training the Unitree Go2 robot on an oscillating bridge, a 13.24-meter steel-and-concrete structure with a 2.0 Hz eigenfrequency designed to perturb locomotion. Using Reinforcement Learning (RL) with the Proximal Policy Optimization (PPO) algorithm in a MuJoCo simulation, we trained 15 distinct locomotion policies, combining five gaits (trot, pace, bound, free, default) with three training conditions: a rigid bridge and two oscillating-bridge setups with differing height regulation strategies (relative to the bridge surface or to the ground). Domain randomization enabled zero-shot transfer to the real-world bridge. Our results demonstrate that policies trained on the oscillating bridge exhibit superior stability and adaptability compared to those trained on rigid surfaces. Our framework enables robust gait patterns even without prior exposure to the real bridge. These findings highlight the potential of simulation-based RL to improve quadruped locomotion under dynamic ground perturbations, offering insights for designing robots capable of traversing vibrating environments.
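The count of 15 policies follows from crossing the five gaits with the three training conditions. The sketch below enumerates that grid. The gait names come from the abstract, whereas the condition identifiers and the `policy_grid` helper are hypothetical labels introduced here, not names from the authors' code.

```python
from itertools import product

# Gait names as listed in the abstract.
GAITS = ("trot", "pace", "bound", "free", "default")

# Hypothetical identifiers for the three training conditions.
CONDITIONS = (
    "rigid_bridge",
    "oscillating_height_rel_bridge",  # height regulated relative to the bridge surface
    "oscillating_height_rel_ground",  # height regulated relative to the ground
)


def policy_grid():
    """Return one config dict per (gait, training condition) pair."""
    return [
        {"gait": gait, "condition": condition}
        for gait, condition in product(GAITS, CONDITIONS)
    ]


if __name__ == "__main__":
    grid = policy_grid()
    print(len(grid))  # 5 gaits x 3 conditions = 15
    for config in grid:
        print(config["gait"], config["condition"])
```

Each entry would correspond to one separately trained policy, so comparisons between rigid and oscillating training can be made gait by gait.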