🤖 AI Summary
Traditional velocity-tracking controllers for short-range SE(2) pose navigation in humanoid robots induce inefficient, “marching-like” locomotion. Method: This paper proposes an end-to-end reinforcement learning framework that directly optimizes pose reaching, bypassing intermediate velocity-trajectory tracking. It introduces a sparse reward function based on a constellation-inspired geometric structure to encourage natural, energy-efficient target-oriented motion; designs a multi-objective evaluation benchmark integrating energy consumption, task completion time, and step count; and employs SE(2)-encoded goal representations with curriculum learning to improve sim-to-real policy transfer. Contribution/Results: Experiments demonstrate significant improvements over baseline methods across all metrics, including reduced energy use, shorter execution time, and fewer steps. The learned policy is successfully deployed on a real humanoid robot platform, validating its practical efficacy and generalizability.
📝 Abstract
Humanoids operating in real-world workspaces must frequently execute task-driven, short-range movements to SE(2) target poses. To be practical, these transitions must be fast, robust, and energy efficient. While learning-based locomotion has made significant progress, most existing methods optimize for velocity tracking rather than direct pose reaching, resulting in inefficient, marching-style behavior when applied to short-range tasks. In this work, we develop a reinforcement learning approach that directly optimizes humanoid locomotion for SE(2) targets. Central to this approach is a new constellation-based reward function that encourages natural and efficient target-oriented movement. To evaluate performance, we introduce a benchmarking framework that measures energy consumption, time-to-target, and footstep count on a distribution of SE(2) goals. Our results show that the proposed approach consistently outperforms standard methods and enables successful transfer from simulation to hardware, highlighting the importance of targeted reward design for practical short-range humanoid locomotion.
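To make the SE(2) goal representation concrete, here is a minimal sketch of one common way to encode a target pose for a learned policy: express the goal in the robot's body frame and represent heading as (cos θ, sin θ) to avoid angle wrap-around. The function name and the exact encoding are illustrative assumptions, not the paper's actual implementation.

```python
import math

def encode_se2_goal(robot_pose, goal_pose):
    """Express a goal SE(2) pose in the robot's body frame.

    Poses are (x, y, theta) in world coordinates. The returned tuple
    (bx, by, cos(dtheta), sin(dtheta)) is a typical policy observation;
    this is a hypothetical sketch, not the paper's exact encoding.
    """
    rx, ry, rth = robot_pose
    gx, gy, gth = goal_pose
    dx, dy = gx - rx, gy - ry
    # Rotate the world-frame offset into the robot's body frame.
    c, s = math.cos(-rth), math.sin(-rth)
    bx = c * dx - s * dy
    by = s * dx + c * dy
    dth = gth - rth
    return (bx, by, math.cos(dth), math.sin(dth))

# A robot facing +y with a goal 1 m ahead sees the goal directly in
# front of it (positive body-frame x) with zero heading error.
print(encode_se2_goal((0.0, 0.0, math.pi / 2), (0.0, 1.0, math.pi / 2)))
```

Feeding the policy this relative, wrap-free encoding (rather than raw world coordinates) is a standard design choice that makes the learned behavior invariant to where in the workspace the maneuver starts.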