🤖 AI Summary
Existing imitation learning approaches predominantly focus on explicit gait trajectories while neglecting passive dynamics—a key factor underlying energy efficiency in biological locomotion.
Method: We propose a physics-aware learning framework for motion control. Specifically, we introduce the Impact Mitigation Factor (IMF) as a differentiable, physics-based reward term that explicitly encodes passive dynamics into reinforcement learning objectives. Integrating Adversarial Motion Priors (AMP) with the IMF-guided reward, our method jointly optimizes both explicit trajectory tracking and implicit energy-efficient mechanisms in an end-to-end manner within simulation.
Contribution/Results: Experiments demonstrate substantial reductions in cost of transport (CoT) across diverse locomotion tasks—up to 32% improvement—thereby effectively bridging the modeling gap between motion imitation and energetic efficiency. To our knowledge, this is the first work to explicitly formulate passive dynamics as an optimizable component of the RL reward structure for learned locomotion control.
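As a rough illustration of how an IMF-guided term might sit alongside an AMP reward, the sketch below sums a task reward, an AMP style reward, and an IMF bonus. The style-reward formula follows the original AMP formulation; the weighted-sum form, the weight values, and the names `combined_reward`, `w_task`, `w_style`, and `w_imf` are assumptions for illustration, not the authors' implementation. The IMF value is assumed to be supplied by the simulator as a scalar in [0, 1].

```python
def amp_style_reward(disc_output: float) -> float:
    """Style reward from the AMP discriminator output d: max(0, 1 - 0.25 * (d - 1)^2)."""
    return max(0.0, 1.0 - 0.25 * (disc_output - 1.0) ** 2)


def combined_reward(
    task_reward: float,
    disc_output: float,
    imf: float,
    w_task: float = 0.5,   # hypothetical weight
    w_style: float = 0.5,  # hypothetical weight
    w_imf: float = 0.1,    # hypothetical weight
) -> float:
    """Weighted sum of task, AMP style, and IMF terms (assumed form)."""
    # imf is assumed to be precomputed per step, with higher values
    # meaning better passive impact mitigation.
    return (
        w_task * task_reward
        + w_style * amp_style_reward(disc_output)
        + w_imf * imf
    )
```

In a sketch like this, setting `w_imf` to zero recovers a plain AMP-style objective, which makes it easy to compare policies trained with and without the passive-dynamics term.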
📝 Abstract
Animals achieve energy-efficient locomotion through their implicit passive dynamics, a marvel that has captivated roboticists for decades. Recently, methods that combine Adversarial Motion Priors (AMP) with reinforcement learning (RL) have shown promising progress in replicating animals' naturalistic motion. However, such imitation learning approaches predominantly capture explicit kinematic patterns, so-called gaits, while overlooking the implicit passive dynamics. This work bridges this gap by incorporating a reward term guided by the Impact Mitigation Factor (IMF), a physics-informed metric that quantifies a robot's ability to passively mitigate impacts. By integrating IMF with AMP, our approach enables RL policies to learn both explicit motion trajectories from animal reference motions and the implicit passive dynamics. We demonstrate energy efficiency improvements of up to 32%, as measured by the Cost of Transport (CoT), across both AMP and handcrafted reward structures.
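For readers unfamiliar with the metric, the Cost of Transport is commonly defined as the dimensionless ratio CoT = P / (m g v), where P is mean power, m is mass, g is gravitational acceleration, and v is mean forward speed. The sketch below computes it and the relative improvement between two policies. The function names are hypothetical, and how the paper measures power (for example, mechanical versus electrical) is not stated here, so that choice is left to the caller.

```python
G = 9.81  # standard gravitational acceleration, m/s^2


def cost_of_transport(mean_power_w: float, mass_kg: float, mean_speed_mps: float) -> float:
    """Dimensionless CoT = P / (m * g * v); lower means more efficient."""
    if mass_kg <= 0 or mean_speed_mps <= 0:
        raise ValueError("mass and mean speed must be positive")
    return mean_power_w / (mass_kg * G * mean_speed_mps)


def relative_improvement(cot_baseline: float, cot_new: float) -> float:
    """Fractional CoT reduction relative to the baseline policy."""
    if cot_baseline <= 0:
        raise ValueError("baseline CoT must be positive")
    return (cot_baseline - cot_new) / cot_baseline
```

Under this definition, an improvement of 32% corresponds to `relative_improvement` returning 0.32, that is, the new policy's CoT being 68% of the baseline's.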