🤖 AI Summary
This work addresses the core challenge of autonomous skill discovery for real-world robots: continuously exploring diverse, high-performance skills without explicit supervision, predefined skill spaces, while ensuring safety and data efficiency. To this end, we propose URSA—a Quality-Diversity (QD) reinforcement learning framework that integrates unsupervised behavioral representation learning with online policy optimization. URSA is the first to achieve end-to-end, fully autonomous, and data-efficient evolution of diverse locomotion skills on a real Unitree A1 quadruped robot without human intervention. It supports both heuristic-guided and fully unsupervised operation modes and enables downstream task transfer. Experiments demonstrate emergent robust walking behaviors in both simulation and reality. Under nine limb-damage scenarios, URSA outperforms baselines in five—three of which achieve state-of-the-art performance on the physical robot—validating the strong generalization and adaptability of the learned skill repertoire.
📝 Abstract
Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention, representing a significant step toward more autonomous and adaptable robotic systems. Demonstration videos are available at http://adaptive-intelligent-robotics.github.io/URSA .