🤖 AI Summary
Quadrupedal locomotion controllers are usually designed for a single robot, which limits their reuse across platforms with different morphologies and dynamics. Method: This paper proposes Platform Adaptive Locomotion (PAL), a deep reinforcement learning framework that trains a single policy on procedurally generated robots, mapping proprioceptive states and base velocity commands to joint actuation targets. The policy is conditioned on a latent embedding of the temporally local system dynamics, and two conditioning strategies are compared: a GRU-based dynamics encoder and a morphology-based property estimator. Contribution/Results: Both variants transfer zero-shot to multiple unseen simulated quadrupeds. In the hardware test on ANYmal C, morphology-aware conditioning tracks velocity commands better than temporal dynamics encoding. Careful robot reference modelling during training reduces velocity tracking error by up to 30% compared to the baseline method. PAL does not surpass the best-performing reference-free controller in all cases, but the analysis identifies critical design choices that inform improvements to the state of the art.
📝 Abstract
This article presents Platform Adaptive Locomotion (PAL), a unified control method for quadrupedal robots with different morphologies and dynamics. We leverage deep reinforcement learning to train a single locomotion policy on procedurally generated robots. The policy maps proprioceptive robot state information and base velocity commands into desired joint actuation targets, which are conditioned using a latent embedding of the temporally local system dynamics. We explore two conditioning strategies - one using a GRU-based dynamics encoder and another using a morphology-based property estimator - and show that morphology-aware conditioning outperforms temporal dynamics encoding regarding velocity task tracking for our hardware test on ANYmal C. Our results demonstrate that both approaches achieve robust zero-shot transfer across multiple unseen simulated quadrupeds. Furthermore, we demonstrate the need for careful robot reference modelling during training, enabling us to reduce the velocity tracking error by up to 30% compared to the baseline method. Despite PAL not surpassing the best-performing reference-free controller in all cases, our analysis uncovers critical design choices and informs improvements to the state of the art.
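The data flow described in the abstract (a history of proprioceptive observations compressed into a latent embedding, which then conditions a policy that outputs joint targets) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the class names, all dimensions, the three-component velocity command, the untrained random weights, and the choice to condition by simple concatenation are assumptions made purely for illustration. The GRU update itself follows the standard gate equations.

```python
import math
import random


def init(rows, cols, rng):
    """Small random weight matrix plus zero bias (untrained placeholder)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return weights, [0.0] * rows


def linear(weights, bias, x):
    """Affine map y = W x + b on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class GRUEncoder:
    """Hypothetical dynamics encoder: a single standard GRU cell."""

    def __init__(self, obs_dim, latent_dim, rng):
        in_dim = obs_dim + latent_dim
        self.latent_dim = latent_dim
        self.update = init(latent_dim, in_dim, rng)  # update gate z
        self.reset = init(latent_dim, in_dim, rng)   # reset gate r
        self.cand = init(latent_dim, in_dim, rng)    # candidate state

    def step(self, h, obs):
        z = [sigmoid(v) for v in linear(*self.update, obs + h)]
        r = [sigmoid(v) for v in linear(*self.reset, obs + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        c = [math.tanh(v) for v in linear(*self.cand, obs + gated)]
        # Blend the previous hidden state with the candidate state.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

    def encode(self, history):
        """Run the cell over the observation history; return the final state."""
        h = [0.0] * self.latent_dim
        for obs in history:
            h = self.step(h, obs)
        return h


class ConditionedPolicy:
    """Hypothetical policy: one hidden layer over [obs, command, latent]."""

    def __init__(self, obs_dim, cmd_dim, latent_dim, num_joints, hidden, rng):
        self.l1 = init(hidden, obs_dim + cmd_dim + latent_dim, rng)
        self.l2 = init(num_joints, hidden, rng)

    def act(self, obs, cmd, latent):
        # Conditioning by concatenation is an assumption of this sketch.
        hid = [math.tanh(v) for v in linear(*self.l1, obs + cmd + latent)]
        return linear(*self.l2, hid)


rng = random.Random(0)
# Assumed sizes; 12 joints corresponds to three actuated joints per leg.
OBS_DIM, CMD_DIM, LATENT_DIM, NUM_JOINTS = 30, 3, 8, 12

encoder = GRUEncoder(OBS_DIM, LATENT_DIM, rng)
policy = ConditionedPolicy(OBS_DIM, CMD_DIM, LATENT_DIM, NUM_JOINTS, hidden=32, rng=rng)

# Stand-in for 20 recent proprioceptive observations.
history = [[rng.gauss(0.0, 1.0) for _ in range(OBS_DIM)] for _ in range(20)]
latent = encoder.encode(history)

# Assumed command layout: forward velocity, lateral velocity, yaw rate.
joint_targets = policy.act(history[-1], [0.5, 0.0, 0.0], latent)
print(len(latent), len(joint_targets))  # 8 12
```

In the morphology-aware variant, the latent passed to `act` would come from a property estimator rather than from `GRUEncoder`; the policy interface in this sketch would stay the same.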