🤖 AI Summary
This work addresses bimanual robotic piano playing, a dexterous manipulation task that is high-dimensional, contact-rich, and demands fast, precise control, through large-scale learning that requires no human demonstrations. The approach has three parts: (i) an automatic fingering strategy based on Optimal Transport (OT); (ii) large-scale reinforcement learning in which more than 2,000 agents, each specialized in distinct pieces, are trained and their experience is aggregated into RP1M++, a dataset of over one million robotic piano-playing trajectories; and (iii) a Flow Matching Transformer trained on RP1M++ through large-scale imitation learning. The resulting agent, OmniPianist, is presented as the first agent able to perform nearly one thousand music pieces, and the reported experiments and ablation studies support the effectiveness and scalability of the approach.
📝 Abstract
Endowing robot hands with human-level dexterity has been a long-standing goal in robotics. Bimanual robotic piano playing represents a particularly challenging task: it is high-dimensional, contact-rich, and requires fast, precise control. We present OmniPianist, the first agent capable of performing nearly one thousand music pieces via scalable, human-demonstration-free learning. Our approach is built on three core components. First, we introduce an automatic fingering strategy based on Optimal Transport (OT), allowing the agent to autonomously discover efficient piano-playing strategies from scratch without demonstrations. Second, we conduct large-scale Reinforcement Learning (RL) by training more than 2,000 agents, each specialized in distinct music pieces, and aggregate their experience into a dataset named RP1M++, consisting of over one million trajectories for robotic piano playing. Finally, we employ a Flow Matching Transformer to leverage RP1M++ through large-scale imitation learning, resulting in the OmniPianist agent capable of performing a wide range of musical pieces. Extensive experiments and ablation studies highlight the effectiveness and scalability of our approach, advancing dexterous robotic piano playing at scale.
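To make the idea of OT-based fingering concrete, here is a minimal sketch rather than the authors' algorithm. It assumes that fingering at a single time step can be approximated as matching each key to be pressed to a distinct finger so that total fingertip travel is minimized. With unit masses, discrete optimal transport reduces to this assignment problem. The function name `assign_fingers`, the one-dimensional positions, and the absolute-distance cost are all illustrative assumptions; the abstract does not specify the actual cost or formulation.

```python
from itertools import permutations


def assign_fingers(fingertip_x, key_x):
    """Match each key to a distinct finger, minimizing total travel distance.

    Hypothetical helper: positions are 1-D coordinates along the keyboard,
    and the cost |key - fingertip| is an assumed stand-in for the real cost.
    Brute force is used for clarity; it is only practical for small inputs.
    """
    n_fingers, n_keys = len(fingertip_x), len(key_x)
    if n_keys > n_fingers:
        raise ValueError("more keys than fingers")

    best_cost, best_perm = float("inf"), None
    # perm[k] is the finger index assigned to key k; fingers are never reused.
    for perm in permutations(range(n_fingers), n_keys):
        cost = sum(abs(key_x[k] - fingertip_x[f]) for k, f in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm

    return dict(enumerate(best_perm)), best_cost


# Five fingertips spread over the keyboard, two keys to press.
assignment, total = assign_fingers([0.0, 2.0, 4.0, 6.0, 8.0], [0.5, 7.8])
print(assignment, total)
```

A practical version would swap the exhaustive search for a dedicated assignment or OT solver. It would also use a cost that reflects hand kinematics rather than plain distance.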