🤖 AI Summary
This work addresses energy-efficient yet stable bipedal locomotion for humanoid robots. Existing model predictive control (MPC) and reinforcement learning (RL) approaches typically fold energy-related metrics into a multi-objective formulation, which demands extensive hyperparameter tuning and often yields suboptimal policies. The authors propose ECO (Energy-Constrained Optimization), a constrained RL framework that removes energy-related metrics from the reward and reformulates them as explicit inequality constraints. Dedicated constraints on energy consumption and reference motion are enforced with the Lagrangian method, which gives the energy cost a physically interpretable form and makes hyperparameter tuning more intuitive. In experiments on the kid-sized humanoid BRUCE, including sim-to-sim and sim-to-real transfer, ECO significantly reduces energy consumption relative to MPC, standard RL with reward shaping, and four state-of-the-art constrained RL methods, while maintaining robust, symmetric walking.
📝 Abstract
Achieving stable and energy-efficient locomotion is essential for humanoid robots to operate continuously in real-world applications. Existing MPC and RL approaches often rely on energy-related metrics embedded within a multi-objective optimization framework, which requires extensive hyperparameter tuning and often results in suboptimal policies. To address these challenges, we propose ECO (Energy-Constrained Optimization), a constrained RL framework that separates energy-related metrics from rewards, reformulating them as explicit inequality constraints. This method provides a clear and interpretable physical representation of energy costs, enabling more efficient and intuitive hyperparameter tuning for improved energy efficiency. ECO introduces dedicated constraints for energy consumption and reference motion, enforced by the Lagrangian method, to achieve stable, symmetric, and energy-efficient walking for humanoid robots. We evaluated ECO against MPC, standard RL with reward shaping, and four state-of-the-art constrained RL methods. Experiments, including sim-to-sim and sim-to-real transfers on the kid-sized humanoid robot BRUCE, demonstrate that ECO significantly reduces energy consumption compared to baselines while maintaining robust walking performance. These results highlight a substantial advancement in energy-efficient humanoid locomotion. All experimental demonstrations can be found on the project website: https://sites.google.com/view/eco-humanoid.
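To make the idea of enforcing an energy inequality constraint with the Lagrangian method more concrete, the following is a minimal sketch of generic Lagrangian relaxation with a dual-ascent multiplier update. It is not the ECO implementation: the function names, the step size, the energy budget, and the per-iteration energy estimates are all hypothetical values chosen for illustration, and the abstract does not specify how ECO updates its multipliers.

```python
# Generic Lagrangian relaxation of an energy constraint (illustrative only).
# Constraint assumed here: expected energy cost <= energy_budget.
# All names and numbers are hypothetical, not taken from the ECO paper.


def lagrangian_objective(reward_return, energy_cost, multiplier, energy_budget):
    """Scalar objective a policy would maximize for a fixed multiplier.

    The constraint violation (energy_cost - energy_budget) is penalized in
    proportion to the current multiplier.
    """
    return reward_return - multiplier * (energy_cost - energy_budget)


def update_multiplier(multiplier, energy_cost, energy_budget, step_size=0.05):
    """Dual ascent step: raise the multiplier when the budget is exceeded,
    lower it otherwise, and project back onto the non-negative reals."""
    return max(0.0, multiplier + step_size * (energy_cost - energy_budget))


# Made-up sequence of measured energy costs over training iterations.
energy_budget = 10.0
multiplier = 0.0
history = []
for energy_cost in [14.0, 12.0, 10.5, 9.0]:
    multiplier = update_multiplier(multiplier, energy_cost, energy_budget)
    history.append(multiplier)
```

In this sketch the only energy-related quantity set by hand is `energy_budget`, which has physical units, rather than a dimensionless reward weight. That is one way to read the abstract's claim that a constraint-based formulation makes tuning more interpretable.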