Efficiently Learning Robust Torque-based Locomotion Through Reinforcement with Model-Based Supervision

📅 2026-01-22
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses instability in bipedal walking caused by dynamics modeling errors and sensor noise, proposing a hybrid approach that combines a model-based controller with residual reinforcement learning. An “oracle” policy, a model-based controller with privileged access to ground-truth dynamics during training, supervises the residual policy. This supervision reduces the need for extensive reward shaping and lets the residual policy learn to compensate for unmodeled effects efficiently. The approach combines Divergent Component of Motion (DCM) trajectory planning, whole-body torque control, domain randomization, and a model-supervised loss, and it reports improved walking robustness and generalization across a range of randomized conditions. The authors present the framework as a scalable route to sim-to-real transfer.
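
For readers unfamiliar with the DCM, the following is a minimal sketch of its standard definition under the linear inverted pendulum model, shown for one horizontal axis. The function names, the fixed center-of-mass height, and the scalar treatment are assumptions made for illustration; the paper's planner is not described at this level of detail in the summary.

```python
import math

GRAVITY = 9.81  # m/s^2


def natural_frequency(com_height):
    """Natural frequency of the linear inverted pendulum, omega = sqrt(g / z)."""
    return math.sqrt(GRAVITY / com_height)


def dcm(com_pos, com_vel, com_height):
    """Divergent Component of Motion along one axis: xi = x + x_dot / omega."""
    return com_pos + com_vel / natural_frequency(com_height)


def dcm_rate(dcm_pos, zmp_pos, com_height):
    """First-order DCM dynamics: xi_dot = omega * (xi - p), with p the ZMP."""
    return natural_frequency(com_height) * (dcm_pos - zmp_pos)


if __name__ == "__main__":
    # Hypothetical state: CoM 5 cm ahead of the origin, moving at 0.3 m/s.
    xi = dcm(com_pos=0.05, com_vel=0.3, com_height=0.9)
    print("DCM:", xi)
    print("DCM rate with ZMP at origin:", dcm_rate(xi, 0.0, 0.9))
```

The DCM diverges away from the ZMP, which is why a planner that places the ZMP relative to the DCM can steer the unstable part of the walking dynamics.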

📝 Abstract
We propose a control framework that integrates model-based bipedal locomotion with residual reinforcement learning (RL) to achieve robust and adaptive walking in the presence of real-world uncertainties. Our approach leverages a model-based controller, comprising a Divergent Component of Motion (DCM) trajectory planner and a whole-body controller, as a reliable base policy. To address the uncertainties of inaccurate dynamics modeling and sensor noise, we introduce a residual policy trained through RL with domain randomization. Crucially, we employ a model-based oracle policy, which has privileged access to ground-truth dynamics during training, to supervise the residual policy via a novel supervised loss. This supervision enables the policy to efficiently learn corrective behaviors that compensate for unmodeled effects without extensive reward shaping. Our method demonstrates improved robustness and generalization across a range of randomized conditions, offering a scalable solution for sim-to-real transfer in bipedal locomotion.
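
The residual structure described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the residual is added at the joint-torque level, that it is clipped to a bound, and that the supervised loss is a mean squared error between the commanded torque and the oracle's torque. The abstract does not specify any of these details, and the names `combine_torques`, `supervised_loss`, and `residual_limit` are hypothetical.

```python
def combine_torques(base_torque, residual_torque, residual_limit):
    """Add a clipped residual to the base controller's joint torques.

    The clipping bound is an assumption for illustration; it keeps the
    learned correction small relative to the model-based command.
    """
    combined = []
    for base, residual in zip(base_torque, residual_torque):
        clipped = max(-residual_limit, min(residual_limit, residual))
        combined.append(base + clipped)
    return combined


def supervised_loss(commanded_torque, oracle_torque):
    """Mean squared error between the commanded and oracle torques.

    The MSE form is an assumed stand-in for the paper's supervised loss.
    """
    squared = [(c - o) ** 2 for c, o in zip(commanded_torque, oracle_torque)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Hypothetical two-joint example.
    base = [1.0, -2.0]       # from the controller using the nominal model
    residual = [0.5, 5.0]    # raw output of the residual policy
    oracle = [1.5, -1.0]     # from the controller using ground-truth dynamics
    command = combine_torques(base, residual, residual_limit=1.0)
    print("command:", command)
    print("loss:", supervised_loss(command, oracle))
```

In this sketch the oracle is needed only to compute the training loss, which matches the abstract's statement that it has privileged access to ground-truth dynamics during training.
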
Problem

Research questions and friction points this paper is trying to address.

bipedal locomotion
robustness
real-world uncertainties
torque-based control
sim-to-real transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

residual reinforcement learning
model-based supervision
bipedal locomotion
sim-to-real transfer
DCM trajectory planning