Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion

📅 2025-03-11
🤖 AI Summary
This work addresses the challenge of learning omnidirectional locomotion in real time on a quadruped robot under tight on-board computational constraints. It applies CrossQ, a sample-efficient off-policy reinforcement learning algorithm with low computational overhead, to train policies directly on the robot. Two control architectures are investigated: predicting joint target positions, for agile, high-speed locomotion, and central pattern generators (CPGs), for stable, natural gaits. Experiments show that locomotion is learned within just 8 minutes of raw real-time training on the robot, and that the approach is robust across different indoor and outdoor environments. Key contributions include: (1) extending on-robot RL from the simple forward gaits of prior work to omnidirectional locomotion; (2) using CrossQ's sample efficiency and minimal computational overhead to make real-time on-robot training practical; and (3) an investigation of joint-target and CPG-based control architectures for on-robot learning.

📝 Abstract
On-robot Reinforcement Learning is a promising approach to train embodiment-aware policies for legged robots. However, the computational constraints of real-time learning on robots pose a significant challenge. We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training utilizing the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ. We investigate two control architectures: Predicting joint target positions for agile, high-speed locomotion and Central Pattern Generators for stable, natural gaits. While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion. We demonstrate the robustness of our approach in different indoor and outdoor environments.
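To make the two control architectures named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the joint-target variant tracks the policy's predicted joint positions with a PD law, and that the CPG variant drives all four legs from one shared oscillator with fixed trot phase offsets. The gains, gait frequency, step height, and the names `pd_torque` and `cpg_foot_heights` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical trot phase offsets (radians) for legs FL, FR, RL, RR:
# diagonal leg pairs move together. These values are not from the paper.
TROT_OFFSETS = (0.0, math.pi, math.pi, 0.0)


def pd_torque(q_target, q, qd, kp=20.0, kd=0.5):
    """Joint-target control: a PD law tracks the target the policy predicts.

    q_target: joint position predicted by the policy (rad)
    q, qd:    measured joint position (rad) and velocity (rad/s)
    kp, kd:   assumed gains, for illustration only
    """
    return kp * (q_target - q) - kd * qd


def cpg_foot_heights(t, frequency_hz=2.0, step_height=0.05, offsets=TROT_OFFSETS):
    """CPG-style control: one shared oscillator with per-leg phase offsets.

    Returns the desired foot lift (m) for each leg at time t (s).
    """
    base_phase = 2.0 * math.pi * frequency_hz * t
    heights = []
    for offset in offsets:
        phase = (base_phase + offset) % (2.0 * math.pi)
        # First half of the cycle is swing (foot lifted); the rest is stance.
        heights.append(step_height * max(0.0, math.sin(phase)))
    return heights
```

In a learning setup of this kind, a policy would typically either output `q_target` directly or modulate CPG parameters such as frequency and step height; which quantities the paper's policies actually control is not stated in the abstract.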
Problem

Research questions and friction points this paper is trying to address.

Efficient on-robot learning for quadruped locomotion
Overcoming computational constraints in real-time training
Extending learning to omnidirectional locomotion in varied environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient on-robot learning in 8 minutes
Uses the CrossQ algorithm for sample efficiency and minimal computational overhead
Supports omnidirectional quadruped locomotion
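
Omnidirectional locomotion is commonly framed as tracking a commanded base velocity (forward, lateral, and yaw rate) rather than only walking forward. The sketch below shows one widely used exponential tracking reward; it is an assumption for illustration and not the reward used in the paper. The function name `tracking_reward` and the scale `sigma` are hypothetical.

```python
import math


def tracking_reward(cmd, measured, sigma=0.25):
    """Reward in (0, 1]: equals 1 when the measured velocity matches the command.

    cmd, measured: (vx, vy, yaw_rate) tuples in the robot's base frame.
    sigma:         assumed tolerance scale; smaller values penalize errors more.
    Linear and angular errors are summed unweighted here for simplicity.
    """
    squared_error = sum((c - m) ** 2 for c, m in zip(cmd, measured))
    return math.exp(-squared_error / sigma)
```

For example, a pure sideways command `(0.0, 0.3, 0.0)` only earns the full reward if the robot actually moves laterally, which a forward-only gait cannot achieve.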
Nico Bohlinger
PhD Student, TU Darmstadt
Reinforcement Learning
Jonathan Kinzel
Department of Computer Science, Technical University of Darmstadt, Germany
Daniel Palenicek
PhD student at Technische Universität Darmstadt
Reinforcement Learning, Machine Learning, Artificial Intelligence
Lukasz Antczak
MAB Robotics, Poznan, Poland
Jan Peters
German Research Center for AI (DFKI), Research Department: Systems AI for Robot Learning