APEX: Action Priors Enable Efficient Exploration for Robust Motion Tracking on Legged Robots

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing legged robot motion tracking methods rely heavily on predefined reference trajectories and extensive manual parameter tuning, limiting generalization and deployment efficiency. This paper proposes APEX, a plug-and-play reinforcement learning extension framework that incorporates expert demonstrations into training via a decaying action prior mechanism—eliminating the need for reference data during deployment. By integrating a multi-critic architecture with policy regularization, APEX enhances sample efficiency, robustness, and cross-terrain, cross-speed, and cross-gait generalization. Evaluated in simulation and on real-world Unitree Go2 hardware, APEX significantly improves training stability and motion tracking accuracy, while enabling reliable transfer under modified reward functions. The core innovations are (i) a dynamically decaying action prior that balances imitation and exploration, and (ii) a collaborative multi-critic constraint mechanism that stabilizes policy learning and improves trajectory fidelity.
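The decaying action prior described above can be illustrated with a minimal sketch. The class name, linear decay schedule, and blending rule here are assumptions for illustration only, not APEX's actual formulation:

```python
class DecayingActionPrior:
    """Hypothetical sketch of a decaying action prior (not APEX's exact scheme).

    Early in training, exploration is biased toward an expert/prior action;
    the bias decays to zero, so deployment needs no reference data.
    """

    def __init__(self, decay_steps: int, beta0: float = 1.0):
        self.decay_steps = decay_steps  # steps over which the prior decays
        self.beta0 = beta0              # initial prior weight
        self.step = 0

    def mixing_weight(self) -> float:
        # Linear decay from beta0 down to 0 (the schedule is an assumption).
        return self.beta0 * max(0.0, 1.0 - self.step / self.decay_steps)

    def blend(self, policy_action: float, prior_action: float) -> float:
        # Mix the policy's action with the expert prior, then advance the clock.
        beta = self.mixing_weight()
        self.step += 1
        return (1.0 - beta) * policy_action + beta * prior_action
```

At step 0 the blended action equals the prior; once `step` exceeds `decay_steps`, the policy acts entirely on its own, which is why no reference data is needed at deployment.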

📝 Abstract
Learning natural, animal-like locomotion from demonstrations has become a core paradigm in legged robotics. Despite the recent advancements in motion tracking, most existing methods demand extensive tuning and rely on reference data during deployment, limiting adaptability. We present APEX (Action Priors enable Efficient Exploration), a plug-and-play extension to state-of-the-art motion tracking algorithms that eliminates any dependence on reference data during deployment, improves sample efficiency, and reduces parameter tuning effort. APEX integrates expert demonstrations directly into reinforcement learning (RL) by incorporating decaying action priors, which initially bias exploration toward expert demonstrations but gradually allow the policy to explore independently. This is combined with a multi-critic framework that balances task performance with motion style. Moreover, APEX enables a single policy to learn diverse motions and transfer reference-like styles across different terrains and velocities, while remaining robust to variations in reward design. We validate the effectiveness of our method through extensive experiments in both simulation and on a Unitree Go2 robot. By leveraging demonstrations to guide exploration during RL training, without imposing explicit bias toward them, APEX enables legged robots to learn with greater stability, efficiency, and generalization. We believe this approach paves the way for guidance-driven RL to boost natural skill acquisition in a wide array of robotic tasks, from locomotion to manipulation. Website and code: https://marmotlab.github.io/APEX/.
Problem

Research questions and friction points this paper is trying to address.

Eliminates dependence on reference data during robot deployment
Improves sample efficiency and reduces parameter tuning effort
Enables robust motion tracking across diverse terrains and velocities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses decaying action priors for efficient exploration
Employs multi-critic framework balancing task and style
Enables single policy learning across diverse conditions
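The multi-critic idea of balancing task performance against motion style could be sketched as follows. The per-critic standardization and the fixed blending weights are assumptions for illustration; the paper's collaborative multi-critic constraint mechanism may differ:

```python
def combine_advantages(task_adv, style_adv, w_task=0.7, w_style=0.3):
    """Hypothetical multi-critic advantage blend (weights are assumptions).

    Each critic's advantages are standardized separately so that neither
    the task reward scale nor the style reward scale dominates the update.
    """
    def standardize(xs):
        mean = sum(xs) / len(xs)
        std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
        std = std if std > 0 else 1.0  # guard against constant advantages
        return [(x - mean) / std for x in xs]

    t = standardize(task_adv)
    s = standardize(style_adv)
    return [w_task * a + w_style * b for a, b in zip(t, s)]
```

Keeping one critic per reward group, rather than summing rewards into a single scalar, is what lets the weights above trade off tracking fidelity against style without retuning the rewards themselves.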
Authors

Shivam Sood
Student, National University of Singapore
Robotics · Controls · Reinforcement Learning · Legged Locomotion

Laukik B. Nakhwa
Department of Mechanical Engineering, College of Design and Engineering, National University of Singapore, Singapore

Sun Ge
Department of Mechanical Engineering, College of Design and Engineering, National University of Singapore, Singapore

Yuhong Cao
National University of Singapore
Robot Learning · Path Planning

Jin Cheng
Doctoral student at ETH Zürich
Loco-manipulation · Robot Learning

Fatemah Zargarbashi
Department of Computer Science, ETH Zurich, Switzerland

Taerim Yoon
PhD Candidate, Korea University
Robotics · Artificial Intelligence

Sungjoon Choi
Korea University
Robotics

Stelian Coros
Department of Computer Science, ETH Zurich, Switzerland

G. Sartoretti
Department of Mechanical Engineering, College of Design and Engineering, National University of Singapore, Singapore