🤖 AI Summary
This paper addresses the time-consuming manual stiffness tuning required for dynamic locomotion in legged robots. The authors propose an end-to-end adaptive framework that embeds variable-stiffness control directly into the reinforcement learning (RL) action space. Methodologically, they introduce grouping strategies for stiffness parameterization (per-joint, per-leg, and hybrid joint-leg), jointly modeling stiffness and joint positions within a hierarchical action space; policies are trained with Proximal Policy Optimization (PPO), domain randomization, and sim-to-real transfer. The contributions are: (1) the first RL-based joint optimization of stiffness and position, automatically balancing velocity tracking, disturbance rejection, and energy efficiency without manual tuning; (2) leg-wise stiffness modulation that improves push recovery and trajectory-tracking accuracy; (3) hybrid stiffness grouping that significantly reduces energy consumption; and (4) robust real-world locomotion on gravel, slopes, and grass, achieved despite training solely in flat-terrain simulation.
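To make the action-space design concrete, here is a minimal sketch of a variable-stiffness PD torque law in which the policy outputs joint position targets together with stiffness commands. The joint count, gain range, and rescaling below are illustrative assumptions for the example, not the paper's actual parameters.

```python
import numpy as np

NUM_JOINTS = 12  # assumed quadruped layout: 4 legs x 3 joints


def pd_torque(action, q, dq, kp_range=(20.0, 80.0), kd=0.5):
    """Map one policy action to joint torques.

    action: [position targets (12) | stiffness commands (12)], with the
            stiffness commands in [-1, 1] (all ranges here are assumptions).
    q, dq:  measured joint positions and velocities.
    """
    action = np.asarray(action, dtype=float)
    q_des = action[:NUM_JOINTS]
    kp_cmd = action[NUM_JOINTS:]
    lo, hi = kp_range
    kp = lo + 0.5 * (kp_cmd + 1.0) * (hi - lo)  # rescale [-1, 1] -> [lo, hi]
    # Variable-stiffness PD law: the proportional gain is chosen by the
    # policy at every control step instead of being hand-tuned offline.
    return kp * (q_des - q) - kd * dq
```

Under this formulation, the RL objective (velocity tracking, disturbance rejection, energy terms) directly shapes how stiff each joint is at each instant, which is what removes the need for manual per-joint gain tuning.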
📝 Abstract
Reinforcement-learned locomotion enables legged robots to perform highly dynamic motions, but it is often accompanied by time-consuming manual tuning of joint stiffness. This paper introduces a novel control paradigm that integrates variable stiffness into the action space alongside joint positions, enabling grouped stiffness control such as per-joint stiffness (PJS), per-leg stiffness (PLS), and hybrid joint-leg stiffness (HJLS). We show that variable-stiffness policies, particularly with per-leg stiffness (PLS) grouping, outperform position-based control in velocity tracking and push recovery, while HJLS excels in energy efficiency. Furthermore, our method demonstrates robust walking behaviour on diverse outdoor terrains via sim-to-real transfer, even though the policy is trained solely on a flat floor. Our approach simplifies design by eliminating per-joint stiffness tuning while remaining competitive across various metrics.
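As an illustration of the three grouping modes, the sketch below broadcasts a grouped stiffness command to one gain per joint. The quadruped layout and, in particular, the HJLS split (a per-joint hip gain plus one shared gain for the rest of each leg) are assumptions made for this example; the abstract does not specify the actual partition.

```python
import numpy as np

NUM_LEGS, JOINTS_PER_LEG = 4, 3  # assumed quadruped layout


def expand_stiffness(kp_cmd, mode):
    """Broadcast a grouped stiffness command to one gain per joint.

    PJS:  one command per joint (12 values, identity mapping).
    PLS:  one command per leg (4 values), shared by that leg's joints.
    HJLS: hypothetical hybrid split -- a per-joint hip gain plus one shared
          gain for each leg's remaining joints (8 values total).
    """
    kp_cmd = np.asarray(kp_cmd, dtype=float)
    if mode == "PJS":
        assert kp_cmd.size == NUM_LEGS * JOINTS_PER_LEG
        return kp_cmd
    if mode == "PLS":
        assert kp_cmd.size == NUM_LEGS
        return np.repeat(kp_cmd, JOINTS_PER_LEG)  # one value -> 3 joints
    if mode == "HJLS":
        assert kp_cmd.size == 2 * NUM_LEGS
        hip, rest = kp_cmd[:NUM_LEGS], kp_cmd[NUM_LEGS:]
        legs = [np.concatenate(([hip[i]],
                                np.full(JOINTS_PER_LEG - 1, rest[i])))
                for i in range(NUM_LEGS)]
        return np.concatenate(legs)
    raise ValueError(f"unknown grouping mode: {mode}")
```

The grouping trades action-space size against expressiveness: under these assumptions PLS shrinks the stiffness portion of the action from 12 dimensions to 4, which is consistent with the reported trade-off between tracking/push-recovery performance (PLS) and energy efficiency (HJLS).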