Learning Quadrupedal Locomotion via Differentiable Simulation

📅 2024-04-03

🏛️ arXiv.org

📈 Citations: 6

✨ Influential: 1

🤖 AI Summary

Problem: Differentiable simulators suffer from gradient distortion in contact-rich tasks such as quadrupedal locomotion because contact dynamics are non-smooth, and existing soft-contact models trade physical fidelity for differentiability. Method: The paper investigates whether analytic gradients remain useful under contact-dominated dynamics. It compares how soft and hard contact models affect gradient-based learning, and it trains locomotion policies with Short-Horizon Actor-Critic (SHAC), an existing algorithm that leverages analytic gradients, to mitigate the optimization difficulties caused by contact discontinuities. Contribution/Results: The approach learns physically plausible locomotion skills and is compared against Proximal Policy Optimization (PPO); the summary reports higher sample efficiency than PPO, along with improvements in convergence speed and adaptation to complex terrains.
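For orientation, the short-horizon policy objective used by SHAC can be written as below. This form follows the original SHAC publication (Xu et al., 2022), not this summary page, so the notation is an assumption here: N parallel rollouts, horizon length h, discount factor γ, reward R, and a learned critic V_φ that bootstraps the return beyond the horizon.

```latex
\mathcal{L}_\theta
  = -\frac{1}{N h} \sum_{i=1}^{N}
    \left[
      \left( \sum_{t=t_0}^{t_0+h-1} \gamma^{\,t-t_0}\, R\big(s_t^i, a_t^i\big) \right)
      + \gamma^{h}\, V_\phi\big(s_{t_0+h}^i\big)
    \right]
```

Because gradients are backpropagated through only h simulation steps, each gradient path crosses fewer contact events, which is one plausible reading of why a short horizon helps with contact discontinuities.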

📝 Abstract

The emergence of differentiable simulators enabling analytic gradient computation has motivated a new wave of learning algorithms that hold the potential to significantly increase sample efficiency over traditional Reinforcement Learning (RL) methods. While recent research has demonstrated performance gains in scenarios with comparatively smooth dynamics and, thus, smooth optimization landscapes, research on leveraging differentiable simulators for contact-rich scenarios, such as legged locomotion, is scarce. This may be attributed to the discontinuous nature of contact, which introduces several challenges to optimizing with analytic gradients. The purpose of this paper is to determine if analytic gradients can be beneficial even in the face of contact. Our investigation focuses on the effects of different soft and hard contact models on the learning process, examining optimization challenges through the lens of contact simulation. We demonstrate the viability of employing analytic gradients to learn physically plausible locomotion skills with a quadrupedal robot using Short-Horizon Actor-Critic (SHAC), a learning algorithm leveraging analytic gradients, and draw a comparison to a state-of-the-art RL algorithm, Proximal Policy Optimization (PPO), to understand the benefits of analytic gradients.
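The effect of the contact model on analytic gradients can be illustrated with a toy problem. The sketch below is not the paper's simulator or its contact models; it is a hypothetical 1D example in which a unit point mass without gravity moves toward the ground, and the function name `simulate` and all parameters are made up for illustration. It returns the final height together with its derivative with respect to the initial height, propagated by hand in forward mode.

```python
def simulate(h0, contact, v0=-1.0, dt=1e-3, steps=2000, k=100.0):
    """Move a unit point mass (no gravity) toward the ground at z = 0.

    Returns (final_z, d final_z / d h0). The derivative is carried along
    with the state in forward mode, which mirrors what an autodiff tool
    would compute for this discrete-time program.
    """
    z, v = h0, v0        # state: height and velocity
    dz, dv = 1.0, 0.0    # tangents of the state with respect to h0
    for _ in range(steps):
        if contact == "soft":
            # Penalty spring: restoring force proportional to penetration.
            if z < 0.0:
                v, dv = v - k * z * dt, dv - k * dz * dt
            z, dz = z + v * dt, dz + dv * dt
        elif contact == "hard":
            z_next, dz_next = z + v * dt, dz + dv * dt
            if z_next < 0.0:
                # Elastic velocity flip at the step boundary; the time of
                # impact inside the step is not resolved.
                v, dv = -v, -dv
                z_next, dz_next = z, dz
            z, dz = z_next, dz_next
        else:
            raise ValueError(f"unknown contact model: {contact}")
    return z, dz


if __name__ == "__main__":
    for model in ("soft", "hard"):
        z_final, grad = simulate(0.5, model)
        print(f"{model}: final z = {z_final:.3f}, d z / d h0 = {grad:.3f}")
```

In continuous time with an elastic bounce, the final height is |v0|·T - h0, so the true derivative is -1. The naive hard-contact variant reports +1 because the velocity flip never depends on h0, while the penalty-based variant yields a value close to -1 at the cost of allowing ground penetration. This is a simplified instance of the fidelity-versus-gradient-quality tension the abstract describes.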

Problem

Research questions and friction points this paper is trying to address.

Non-smooth contact dynamics distort analytic gradients in differentiable simulation

Asks whether analytic gradients remain beneficial for contact-rich tasks such as legged locomotion

Examines how soft and hard contact models affect gradient-based learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Comparison of soft and hard contact models for learning with analytic gradients

Quadrupedal locomotion policy trained via analytic gradients with SHAC

Comparison against PPO to assess the benefits of analytic gradients
