Dexterous Robotic Piano Playing at Scale

📅 2025-11-04
🤖 AI Summary
This work addresses bimanual robotic piano playing, a dexterous manipulation task characterized by high-dimensional control, rich physical contact, and fast, precise motion, through large-scale learning that requires no human demonstrations. The method has three parts: an automatic fingering strategy based on Optimal Transport (OT); large-scale reinforcement learning in which more than 2,000 agents are each trained on distinct pieces, with their experience aggregated into the RP1M++ dataset of over one million trajectories; and a Flow Matching Transformer trained on RP1M++ through imitation learning. The resulting agent, OmniPianist, is presented as the first agent capable of performing nearly one thousand music pieces, and experiments and ablation studies are reported to support the effectiveness and scalability of the approach.

📝 Abstract
Endowing robot hands with human-level dexterity has been a long-standing goal in robotics. Bimanual robotic piano playing represents a particularly challenging task: it is high-dimensional, contact-rich, and requires fast, precise control. We present OmniPianist, the first agent capable of performing nearly one thousand music pieces via scalable, human-demonstration-free learning. Our approach is built on three core components. First, we introduce an automatic fingering strategy based on Optimal Transport (OT), allowing the agent to autonomously discover efficient piano-playing strategies from scratch without demonstrations. Second, we conduct large-scale Reinforcement Learning (RL) by training more than 2,000 agents, each specialized in distinct music pieces, and aggregate their experience into a dataset named RP1M++, consisting of over one million trajectories for robotic piano playing. Finally, we employ a Flow Matching Transformer to leverage RP1M++ through large-scale imitation learning, resulting in the OmniPianist agent capable of performing a wide range of musical pieces. Extensive experiments and ablation studies highlight the effectiveness and scalability of our approach, advancing dexterous robotic piano playing at scale.
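As a rough illustration of the fingering idea, the sketch below treats fingering as a minimum-cost assignment of fingers to the keys that must be pressed, which is the discrete special case of optimal transport with unit masses. It is a toy under stated assumptions, not the paper's formulation: the 1-D positions, the absolute-distance cost, and the names `assign_fingers`, `finger_pos`, and `key_pos` are all hypothetical, and a brute-force search stands in for a proper OT solver.

```python
from itertools import permutations


def assign_fingers(finger_pos, key_pos):
    """Return (cost, mapping), where mapping[i] is the finger chosen for key i.

    Brute-force search over all injective finger-to-key assignments;
    only practical for a handful of simultaneously pressed keys.
    """
    best_cost, best_map = float("inf"), None
    for fingers in permutations(range(len(finger_pos)), len(key_pos)):
        # Assumed cost: total distance the chosen fingers must travel.
        cost = sum(abs(finger_pos[f] - k) for f, k in zip(fingers, key_pos))
        if cost < best_cost:
            best_cost, best_map = cost, fingers
    return best_cost, best_map


# Hypothetical 1-D positions along the keyboard, in arbitrary units.
finger_pos = [0.0, 1.0, 2.0, 3.0, 4.0]
key_pos = [0.4, 3.6]
cost, mapping = assign_fingers(finger_pos, key_pos)
print(mapping, round(cost, 3))
```

In this toy setting the nearest free finger is chosen for each key; a realistic cost would likely also account for hand kinematics and upcoming notes, which the abstract does not specify.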
Problem

Research questions and friction points this paper is trying to address.

Developing dexterous robotic hands for complex piano playing
Enabling autonomous piano performance without human demonstrations
Scaling robotic control to handle high-dimensional, contact-rich tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal Transport for autonomous fingering strategy
Large-scale Reinforcement Learning with specialized agents
Flow Matching Transformer for imitation learning
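For readers unfamiliar with flow matching, the following sketch shows the standard conditional flow matching objective with a straight-line path between a noise sample and a target action. It is an assumption-laden illustration rather than the paper's training code: the summary does not state which path or conditioning is used, the Transformer is replaced by an arbitrary callable, and `interpolate`, `flow_matching_loss`, and `velocity_fn` are hypothetical names.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight line from x0 (noise) to x1 (data) at time t in [0, 1]."""
    t = t[:, None]  # broadcast one time value per sample over the action dimension
    return (1.0 - t) * x0 + t * x1


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between the predicted velocity and the target x1 - x0."""
    x_t = interpolate(x0, x1, t)
    target = x1 - x0  # velocity of the straight-line path, constant in t
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))


rng = np.random.default_rng(0)
actions = rng.normal(size=(4, 3))  # stand-in for actions drawn from a trajectory dataset
noise = rng.normal(size=(4, 3))
t = rng.uniform(size=4)

# Placeholder model that always predicts zero velocity.
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = flow_matching_loss(zero_model, noise, actions, t)
```

In practice the placeholder would be a learned network conditioned on observations and the music to be played, and actions would be generated at inference time by integrating the predicted velocity from noise.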
Authors

Le Chen
Empirical Inference Department of Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany

Yi Zhao
Aalto University, 02150 Espoo, Finland

Jan Schneider
Empirical Inference Department of Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany

Quankai Gao
University of Southern California
Computer Vision

Simon Guist
Empirical Inference Department of Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany

Cheng Qian
Imperial College London, SW7 2AZ, London, United Kingdom

Juho Kannala
Associate Professor, Aalto University & University of Oulu, Finland
Computer Vision, Machine Learning

Bernhard Schölkopf
Empirical Inference Department of Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany

J. Pajarinen
Aalto University, 02150 Espoo, Finland

Dieter Büchler
Empirical Inference Department of Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany