Adaptive Legged Locomotion via Online Learning for Model Predictive Control

📅 2025-10-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses adaptive legged locomotion for quadrupedal robots operating under unknown payloads and uneven terrain. We propose a closed-loop framework that couples online learning with model predictive control (MPC). An online learner approximates the residual dynamics, which capture modeling errors and external disturbances, using random Fourier features (RFF) in a reproducing kernel Hilbert space; the analysis establishes a sublinear dynamic regret bound. The learner continuously updates the prediction model used by MPC, which in turn computes the control inputs. In Gazebo and MuJoCo simulations the quadruped tracks reference trajectories under constant unknown external forces up to 12g (g being the gravity vector) on flat terrain, on 20° slopes, and on rough terrain with 0.25 m height variation, as well as under time-varying disturbances with payloads up to 8 kg and time-varying ground friction coefficients. The approach is reported to improve trajectory tracking accuracy and robustness relative to conventional MPC.

📝 Abstract

We provide an algorithm for adaptive legged locomotion via online learning and model predictive control. The algorithm is composed of two interacting modules: model predictive control (MPC) and online learning of residual dynamics. The residual dynamics can represent modeling errors and external disturbances. We are motivated by the future of autonomy where quadrupeds will autonomously perform complex tasks despite unknown real-world uncertainty, such as unknown payloads and uneven terrain. The algorithm uses random Fourier features to approximate the residual dynamics in reproducing kernel Hilbert spaces. Then, it employs MPC based on the current learned model of the residual dynamics. The model is updated online in a self-supervised manner using least squares based on the data collected while controlling the quadruped. The algorithm enjoys sublinear \textit{dynamic regret}, defined as the suboptimality against an optimal clairvoyant controller that knows the residual dynamics. We validate our algorithm in Gazebo and MuJoCo simulations, where the quadruped aims to track reference trajectories. The Gazebo simulations include constant unknown external forces up to $12\boldsymbol{g}$, where $\boldsymbol{g}$ is the gravity vector, on flat terrain, sloped terrain with $20^\circ$ inclination, and rough terrain with $0.25~\text{m}$ height variation. The MuJoCo simulations include time-varying unknown disturbances with payloads up to $8~\text{kg}$ and time-varying ground friction coefficients on flat terrain.
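The learning module described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar state, a Gaussian kernel, and a plain online gradient step on the squared error as a simple stand-in for the least-squares update. The class `RFFResidualLearner`, the toy disturbance `true_residual`, and all parameter values are hypothetical. The label used for learning is the gap between the measured next state and the nominal model's prediction, which is what makes the update self-supervised.

```python
import math
import random


class RFFResidualLearner:
    """Online learner for a scalar residual h(z) ~ w . phi(z) (illustrative only)."""

    def __init__(self, input_dim, num_features=100, lengthscale=1.0,
                 step_size=0.5, seed=0):
        rng = random.Random(seed)
        # Frequencies drawn from N(0, 1/lengthscale^2) approximate a Gaussian kernel.
        self.omega = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(input_dim)]
                      for _ in range(num_features)]
        self.phase = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
        self.scale = math.sqrt(2.0 / num_features)
        self.weights = [0.0] * num_features
        self.step_size = step_size

    def features(self, z):
        # phi_i(z) = sqrt(2/D) * cos(omega_i . z + b_i)
        return [self.scale * math.cos(sum(o * x for o, x in zip(om, z)) + b)
                for om, b in zip(self.omega, self.phase)]

    def predict(self, z):
        return sum(w * p for w, p in zip(self.weights, self.features(z)))

    def update(self, z, residual):
        """One gradient step on 0.5 * (w . phi(z) - residual)^2; returns the error."""
        phi = self.features(z)
        error = sum(w * p for w, p in zip(self.weights, phi)) - residual
        self.weights = [w - self.step_size * error * p
                        for w, p in zip(self.weights, phi)]
        return error


def true_residual(x):
    # Hypothetical unknown disturbance, used only to generate toy data.
    return -1.5 + 0.5 * math.sin(x)


def run_demo(steps=3000, dt=0.01, seed=1):
    rng = random.Random(seed)
    learner = RFFResidualLearner(input_dim=1)
    abs_errors = []
    for _ in range(steps):
        x = rng.uniform(-2.0, 2.0)
        u = rng.uniform(-1.0, 1.0)
        x_next = x + dt * (u + true_residual(x))  # "measured" next state
        x_nominal = x + dt * u                    # nominal model prediction
        observed = (x_next - x_nominal) / dt      # self-supervised label
        abs_errors.append(abs(learner.update([x], observed)))
    early = sum(abs_errors[:100]) / 100
    late = sum(abs_errors[-100:]) / 100
    return early, late


if __name__ == "__main__":
    early, late = run_demo()
    print(f"mean |error|, first 100 steps: {early:.4f}")
    print(f"mean |error|, last 100 steps:  {late:.4f}")
```

In a full pipeline, an MPC solver would call `predict` inside its prediction model at every planning step and `update` after every new measurement. The solver itself is omitted here, since the abstract does not specify its formulation.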

Problem

Research questions and friction points this paper is trying to address.

Developing adaptive legged locomotion for quadrupeds under uncertainty

Online learning of residual dynamics to handle modeling errors

Enabling robust trajectory tracking despite unknown payloads and terrains

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online learning of residual dynamics via random Fourier features

Model predictive control with self-supervised model updates

Sublinear dynamic regret against clairvoyant controller performance
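
For reference, dynamic regret is commonly written as the cumulative cost gap to the clairvoyant controller. The notation below is an assumption for illustration ($c_t$ is the stage cost, $(x_t, u_t)$ the state and input of the learning controller, $(x_t^{\star}, u_t^{\star})$ those of the clairvoyant controller, $T$ the horizon); the paper's exact definition may differ.

```latex
\mathrm{Regret}_T^{D}
  \;=\; \sum_{t=1}^{T} c_t(x_t, u_t)
  \;-\; \sum_{t=1}^{T} c_t(x_t^{\star}, u_t^{\star}),
\qquad
\lim_{T \to \infty} \frac{\mathrm{Regret}_T^{D}}{T} \;=\; 0 .
```

The second expression is what "sublinear" means: the average per-step suboptimality vanishes as the horizon grows, so the learning controller asymptotically matches the clairvoyant one on average.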
