Learning Humanoid Arm Motion via Centroidal Momentum Regularized Multi-Agent Reinforcement Learning

📅 2025-07-05
📈 Citations: 0
Influential: 0

🤖 AI Summary
To coordinate arm swing with whole-body dynamics during humanoid locomotion, this paper proposes a limb-level multi-agent reinforcement learning framework regularized by centroidal angular momentum (CAM). The arms and legs are controlled by separate actor-critic agents, trained with centralized critics and decentralized actors that share only base states and CAM observations. A modular reward design lets each agent specialize: CAM tracking and damping rewards guide the arm agent toward motions that reduce whole-body angular momentum and vertical ground reaction moments. In comparative studies against single-agent and alternative multi-agent baselines, and in deployment on a humanoid platform, the approach improves balance during locomotion and under external perturbations across flat-ground walking, rough terrain traversal, and stair climbing.

📝 Abstract
Humans naturally swing their arms during locomotion to regulate whole-body dynamics, reduce angular momentum, and help maintain balance. Inspired by this principle, we present a limb-level multi-agent reinforcement learning (RL) framework that enables coordinated whole-body control of humanoid robots through emergent arm motion. Our approach employs separate actor-critic structures for the arms and legs, trained with centralized critics but decentralized actors that share only base states and centroidal angular momentum (CAM) observations, allowing each agent to specialize in task-relevant behaviors through modular reward design. The arm agent, guided by CAM tracking and damping rewards, learns arm motions that reduce overall angular momentum and vertical ground reaction moments, contributing to improved balance during locomotion and under external perturbations. Comparative studies with single-agent and alternative multi-agent baselines further validate the effectiveness of our approach. Finally, we deploy the learned policy on a humanoid platform, achieving robust performance across diverse locomotion tasks, including flat-ground walking, rough terrain traversal, and stair climbing.
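
To make the CAM quantity in the abstract concrete, here is a minimal sketch of how centroidal angular momentum can be computed from per-link rigid-body state, followed by one possible shape for a tracking-plus-damping reward. The momentum formula is the standard rigid-body definition. The reward form, the function name `cam_reward`, and the parameters `sigma`, `w_track`, and `w_damp` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def centroidal_angular_momentum(masses, positions, velocities, inertias_world, omegas):
    """Angular momentum of a set of rigid links about their combined center of mass.

    masses:         (n,)      link masses
    positions:      (n, 3)    link CoM positions in the world frame
    velocities:     (n, 3)    link CoM linear velocities in the world frame
    inertias_world: (n, 3, 3) link inertia tensors expressed in the world frame
    omegas:         (n, 3)    link angular velocities in the world frame
    """
    masses = np.asarray(masses, dtype=float)
    positions = np.asarray(positions, dtype=float)
    velocities = np.asarray(velocities, dtype=float)
    inertias_world = np.asarray(inertias_world, dtype=float)
    omegas = np.asarray(omegas, dtype=float)

    # Whole-body center of mass.
    com = (masses[:, None] * positions).sum(axis=0) / masses.sum()
    # Orbital part: sum_i (r_i - c) x (m_i v_i).
    orbital = np.cross(positions - com, masses[:, None] * velocities).sum(axis=0)
    # Spin part: sum_i I_i w_i.
    spin = np.einsum("nij,nj->i", inertias_world, omegas)
    return orbital + spin


def cam_reward(cam, cam_prev, dt, cam_ref=None, sigma=1.0, w_track=1.0, w_damp=0.1):
    """Hypothetical arm reward: track a reference CAM and penalize its rate of change.

    ASSUMPTION: the exponential tracking kernel and the squared-rate damping
    penalty are illustrative choices, not the paper's reward definition.
    """
    cam = np.asarray(cam, dtype=float)
    cam_prev = np.asarray(cam_prev, dtype=float)
    cam_ref = np.zeros(3) if cam_ref is None else np.asarray(cam_ref, dtype=float)

    tracking = np.exp(-np.sum((cam - cam_ref) ** 2) / sigma)
    cam_rate = (cam - cam_prev) / dt
    damping = np.sum(cam_rate ** 2)
    return float(w_track * tracking - w_damp * damping)
```

With a zero reference, the tracking term is largest when the whole-body angular momentum vanishes, which matches the abstract's description of arm motions that reduce overall angular momentum. Adding the same linear velocity to every link leaves the result of `centroidal_angular_momentum` unchanged, because the momentum is taken about the center of mass.
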
Problem

Research questions and friction points this paper is trying to address.

Develop multi-agent RL for humanoid arm-leg coordination
Reduce angular momentum via arm motion for balance
Enable robust locomotion on diverse terrains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent RL for humanoid arm-leg coordination
CAM-guided arm motion for balance enhancement
Modular reward design for specialized agent behaviors
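
The decentralized-actor, centralized-critic split listed above can be sketched as an observation-building step. This is a minimal sketch under stated assumptions: the joint index ranges (`LEG_JOINTS`, `ARM_JOINTS`), the 20-joint layout, and the choice to feed the critic every joint state are hypothetical. The source only states that the actors share base states and CAM observations and that the critics are centralized.

```python
import numpy as np

# ASSUMPTION: hypothetical joint layout, 12 leg joints followed by 8 arm joints.
LEG_JOINTS = slice(0, 12)
ARM_JOINTS = slice(12, 20)


def build_observations(base_state, cam, joint_pos, joint_vel):
    """Return per-agent actor observations and a centralized critic observation."""
    base_state = np.asarray(base_state, dtype=float)
    cam = np.asarray(cam, dtype=float)
    joint_pos = np.asarray(joint_pos, dtype=float)
    joint_vel = np.asarray(joint_vel, dtype=float)

    # The only information shared between the two actors: base state and CAM.
    shared = np.concatenate([base_state, cam])

    def own(idx):
        # Joint positions and velocities belonging to one limb group.
        return np.concatenate([joint_pos[idx], joint_vel[idx]])

    actor_obs = {
        "legs": np.concatenate([shared, own(LEG_JOINTS)]),
        "arms": np.concatenate([shared, own(ARM_JOINTS)]),
    }
    # ASSUMPTION: the centralized critic sees the full robot state during training.
    critic_obs = np.concatenate([shared, joint_pos, joint_vel])
    return actor_obs, critic_obs
```

Each actor sees only its own joints plus the shared vector, so the arm policy can run at deployment without access to leg joint states, while the critic can use the full state during training.
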