Sampling Strategies for Robust Universal Quadrupedal Locomotion Policies

📅 2025-10-08
🤖 AI Summary
This work addresses the limited generalizability of universal locomotion policies for quadrupedal robots and their poor robustness in Sim2Real transfer. To this end, we propose a parameterized sampling strategy tailored for multi-configuration adaptation. Our method constructs a mass-to-gain mapping function to jointly randomize physical parameters and PD gains; integrates an adaptive performance filtering mechanism to increase the density of high-performing configurations in the sampled distribution; and incorporates a prior reference model to guide the sampling space, thereby enhancing zero-shot adaptability to unseen robot configurations. A single PPO policy is trained in RaiSim and deployed zero-shot, without fine-tuning, on ANYmal hardware. Experiments indicate that substantial randomization of joint controller gains is needed to narrow the Sim2Real gap, and that the resulting policy maintains stable locomotion across diverse mass, inertia, and joint gain configurations. Compared with multiple baseline implementations, it shows better generalization and robustness.

📝 Abstract
This work focuses on sampling strategies of configuration variations for generating robust universal locomotion policies for quadrupedal robots. We investigate the effects of sampling physical robot parameters and joint proportional-derivative gains to enable training a single reinforcement learning policy that generalizes to multiple parameter configurations. Three fundamental joint gain sampling strategies are compared: parameter sampling with (1) linear and polynomial function mappings of mass-to-gains, (2) performance-based adaptive filtering, and (3) uniform random sampling. We improve the robustness of the policy by biasing the configurations using nominal priors and reference models. All training was conducted on RaiSim, tested in simulation on a range of diverse quadrupeds, and zero-shot deployed onto hardware using the ANYmal quadruped robot. Compared to multiple baseline implementations, our results demonstrate the need for significant joint controller gains randomization for robust closing of the sim-to-real gap.
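To make the comparison concrete, here is a minimal sketch of how two of the strategies named in the abstract might draw a joint-gain configuration: a mass-to-gain mapping (linear or polynomial) and uniform random sampling. The abstract does not give ranges, coefficients, or the functional form of the mappings, so every constant and function name below is a hypothetical placeholder, including the assumption that the derivative gain is a fixed fraction of the proportional gain.

```python
import random

# All ranges and coefficients are hypothetical placeholders, not values from the paper.
MASS_RANGE = (10.0, 60.0)   # robot mass in kg
KP_RANGE = (20.0, 200.0)    # proportional gain bounds for uniform sampling
KD_RANGE = (0.2, 5.0)       # derivative gain bounds for uniform sampling
KD_OVER_KP = 0.02           # assumed fixed Kd/Kp ratio for the mapped strategies


def linear_gains(mass, kp_per_kg=2.0):
    """Linear mass-to-gain mapping: Kp scales proportionally with mass."""
    kp = kp_per_kg * mass
    return kp, KD_OVER_KP * kp


def polynomial_gains(mass, coeffs=(10.0, 1.5, 0.01)):
    """Polynomial mapping: Kp = c0 + c1 * m + c2 * m^2."""
    kp = sum(c * mass ** i for i, c in enumerate(coeffs))
    return kp, KD_OVER_KP * kp


def uniform_gains(rng):
    """Uniform random sampling: gains are drawn independently of mass."""
    return rng.uniform(*KP_RANGE), rng.uniform(*KD_RANGE)


def sample_configuration(rng, strategy):
    """Draw one training configuration (mass plus PD gains)."""
    mass = rng.uniform(*MASS_RANGE)
    if strategy == "linear":
        kp, kd = linear_gains(mass)
    elif strategy == "polynomial":
        kp, kd = polynomial_gains(mass)
    elif strategy == "uniform":
        kp, kd = uniform_gains(rng)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {"mass": mass, "kp": kp, "kd": kd}


rng = random.Random(0)
for strategy in ("linear", "polynomial", "uniform"):
    print(strategy, sample_configuration(rng, strategy))
```

In this sketch the mapped strategies tie the gains to the sampled mass, while uniform sampling decouples them, which is the distinction the comparison in the abstract turns on.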
Problem

Research questions and friction points this paper is trying to address.

Generating robust universal locomotion policies for quadruped robots
Investigating sampling strategies for robot parameters and joint gains
Improving policy robustness to close the sim-to-real gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sampling physical robot parameters and joint PD gains
Comparing linear and polynomial mass-to-gain mappings, adaptive filtering, and uniform random sampling
Biasing configurations with nominal priors and reference models
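
The page does not say how the nominal prior enters the sampler or what rule the performance-based filter applies. As one plausible reading only, the sketch below biases gain samples toward a nominal value with a clamped Gaussian, then keeps the best-scoring fraction of candidates. The function names, the Gaussian form, the keep fraction, and the stand-in score function are all assumptions; in practice the score would presumably come from policy rollouts in simulation.

```python
import random


def sample_near_nominal(rng, nominal, rel_std=0.3, low=0.0):
    """Draw a value biased toward a nominal prior (hypothetical scheme).

    Uses a Gaussian centered on the nominal value and clamps at `low`.
    """
    return max(low, rng.gauss(nominal, rel_std * nominal))


def adaptive_filter(candidates, score_fn, keep_fraction=0.5):
    """Keep the best-scoring fraction of candidates (hypothetical rule)."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


rng = random.Random(1)
nominal_kp = 100.0  # placeholder nominal proportional gain

# Candidate gains clustered around the nominal prior.
candidates = [sample_near_nominal(rng, nominal_kp) for _ in range(200)]


# Stand-in score: closer to nominal is "better". A real score would be
# a measured performance metric, which this sketch does not model.
def placeholder_score(kp):
    return -abs(kp - nominal_kp)


kept = adaptive_filter(candidates, placeholder_score, keep_fraction=0.25)
print(len(candidates), "candidates ->", len(kept), "kept")
```

Repeating the filter over training iterations would concentrate the sampled distribution on configurations that score well, which is the effect the summary attributes to adaptive filtering.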