MS-PPO: Morphological-Symmetry-Equivariant Policy for Legged Robot Locomotion

📅 2025-11-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Reinforcement learning (RL) policies for legged robots often neglect morphological structure and symmetry, leading to inefficient training and poor generalization. To address this, we propose a graph neural RL framework that jointly incorporates kinematic structural modeling and symmetry-enforcing equivariant constraints. Our key contribution is the first morphology-aware equivariant graph neural policy network, which explicitly encodes the robot's topological structure and the group actions of its symmetry group, ensuring that policy outputs transform equivariantly and value functions remain invariant under symmetry transformations, without requiring reward shaping or data augmentation. We validate the framework in simulation and on multiple physical robot platforms. Results demonstrate significant improvements in sample efficiency (+42% on average) and cross-gait/cross-platform generalization, while maintaining high stability in complex dynamic environments.

📝 Abstract
Reinforcement learning has recently enabled impressive locomotion capabilities on legged robots; however, most policy architectures remain morphology- and symmetry-agnostic, leading to inefficient training and limited generalization. This work introduces MS-PPO, a morphological-symmetry-equivariant policy learning framework that encodes robot kinematic structure and morphological symmetries directly into the policy network. We construct a morphology-informed graph neural architecture that is provably equivariant with respect to the robot's morphological symmetry group actions, ensuring consistent policy responses under symmetric states while maintaining invariance in value estimation. This design eliminates the need for tedious reward shaping or costly data augmentation, which are typically required to enforce symmetry. We evaluate MS-PPO in simulation on Unitree Go2 and Xiaomi CyberDog2 robots across diverse locomotion tasks, including trotting, pronking, slope walking, and bipedal turning, and further deploy the learned policies on hardware. Extensive experiments show that MS-PPO achieves superior training stability, symmetry generalization ability, and sample efficiency in challenging locomotion tasks, compared to state-of-the-art baselines. These findings demonstrate that embedding both kinematic structure and morphological symmetry into policy learning provides a powerful inductive bias for legged robot locomotion control. Our code will be made publicly available at https://lunarlab-gatech.github.io/MS-PPO/.
Problem

Research questions and friction points this paper is trying to address.

Develops a symmetry-equivariant policy for legged robot locomotion control
Encodes robot kinematic structure and morphological symmetries into policy network
Improves training stability, generalization, and sample efficiency in locomotion tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph neural network encodes robot kinematic structure
Equivariant policy ensures consistent responses under symmetry
Eliminates need for reward shaping or data augmentation
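The equivariance and invariance properties listed above can be illustrated with a minimal sketch. This is not the authors' architecture: it only shows the structural idea that sharing weights across legs makes a per-leg policy head permutation-equivariant by construction, while pooling per-leg features makes the critic invariant. The 4-leg layout, feature sizes, and mirror permutation `g` are hypothetical, and the sign-flip component of a real morphological symmetry group action is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 legs, 3 features per leg (e.g. joint angles).
W_pi = rng.normal(size=(3, 3))   # per-leg policy weights, shared across legs
w_v = rng.normal(size=3)         # per-leg value weights, shared across legs

def policy(obs):
    # obs: (4 legs, 3 features) -> (4, 3) per-leg actions.
    # Applying the same weights to every leg makes this map commute
    # with any permutation of the leg axis (permutation-equivariant).
    return np.tanh(obs @ W_pi)

def value(obs):
    # Summing per-leg terms makes the scalar critic permutation-invariant.
    return float(np.sum(obs @ w_v))

obs = rng.normal(size=(4, 3))
g = [1, 0, 3, 2]                 # mirror symmetry action: swap left/right legs

# Equivariance: acting on the input permutes the output the same way.
assert np.allclose(policy(obs[g]), policy(obs)[g])
# Invariance: the value estimate is unchanged under the symmetry action.
assert np.isclose(value(obs[g]), value(obs))
```

In MS-PPO these constraints are built into a morphology-informed graph network rather than simple weight sharing, which is why no reward shaping or symmetric data augmentation is needed to obtain them.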
🔎 Similar Papers
2024-03-26 · IEEE/RSJ International Conference on Intelligent Robots and Systems · Citations: 8
Sizhe Wei, Georgia Institute of Technology (Robotics)
Xulin Chen, Syracuse University, NY 13244, USA
Fengze Xie, California Institute of Technology (Robotics, Robot Learning, Control)
Garrett Ethan Katz, Syracuse University, NY 13244, USA
Zhenyu Gan, Aerospace and Mechanical Engineering Department, Syracuse University (Legged Locomotion)
Lu Gan, Lunar Lab, Georgia Institute of Technology, GA 30332, USA