🤖 AI Summary
To address the challenge of simultaneously achieving real-time generation and interactive control in AI-assisted music composition, this paper introduces live music models, a human-AI co-creation paradigm that couples low-latency continuous generation with synchronized user intervention. Methodologically, we design a streaming architecture that accepts multimodal conditioning (text and audio prompts) and release both an open-weights model and a public API. Our contributions are threefold: (1) we introduce Magenta RealTime, the first open-weights model to offer live generation, which outperforms existing open-weights music generation models on automatic quality metrics despite using fewer parameters, alongside Lyria RealTime, an API-based model with extended controls and wide prompt coverage; (2) we enable low-latency, real-time control over multiple musical dimensions, including style, structure, and dynamics; (3) we establish an interaction-driven paradigm for live music creation, providing a reproducible and deployable technical pathway for generative music systems.
📝 Abstract
We introduce a new class of generative models for music called live music models that produce a continuous stream of music in real-time with synchronized user control. We release Magenta RealTime, an open-weights live music model that can be steered using text or audio prompts to control acoustic style. On automatic metrics of music quality, Magenta RealTime outperforms other open-weights music generation models, despite using fewer parameters and offering first-of-its-kind live generation capabilities. We also release Lyria RealTime, an API-based model with extended controls, offering access to our most powerful model with wide prompt coverage. These models demonstrate a new paradigm for AI-assisted music creation that emphasizes human-in-the-loop interaction for live music performance.
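To make the live-model paradigm concrete, the sketch below shows one way a continuous stream with synchronized control can be structured: audio is produced chunk by chunk, each chunk conditioned on recent context, and user prompt changes take effect at the next chunk boundary. This is a minimal hypothetical illustration, not the Magenta RealTime or Lyria RealTime API; `StubLiveMusicModel`, `live_loop`, the 2-second chunk size, and the 10-second context window are all assumptions introduced for exposition.

```python
# Hypothetical sketch of a live music model's interaction loop.
# None of these names come from Magenta RealTime or Lyria RealTime; they only
# illustrate chunked streaming generation with synchronized user control.
import queue
import threading
import numpy as np

SAMPLE_RATE = 48_000   # assumed output sample rate
CHUNK_SECONDS = 2.0    # assumed generation chunk length

class StubLiveMusicModel:
    """Placeholder: maps (context audio, style prompt) -> next audio chunk."""
    def generate_chunk(self, context: np.ndarray, style_prompt: str) -> np.ndarray:
        # A real model would condition on `context` and an embedding of
        # `style_prompt`; here we just emit silence of the right shape.
        return np.zeros(int(SAMPLE_RATE * CHUNK_SECONDS), dtype=np.float32)

def live_loop(model, prompt_updates: queue.Queue, audio_out: queue.Queue,
              stop: threading.Event) -> None:
    """Generate audio chunk by chunk, applying any prompt update between chunks.

    Because each chunk is conditioned on previously generated audio, a prompt
    change takes effect at the next chunk boundary, which is what makes the
    stream feel continuously steerable without interrupting playback."""
    context = np.zeros(0, dtype=np.float32)
    style_prompt = "ambient synth"  # arbitrary initial prompt
    while not stop.is_set():
        try:
            # Non-blocking check: did the user steer the model since the last chunk?
            style_prompt = prompt_updates.get_nowait()
        except queue.Empty:
            pass
        chunk = model.generate_chunk(context, style_prompt)
        audio_out.put(chunk)  # hand off to a playback thread
        # Keep only the most recent ~10 s of audio as conditioning context.
        context = np.concatenate([context, chunk])[-SAMPLE_RATE * 10:]
        # A real deployment would also pace generation against the audio clock.

if __name__ == "__main__":
    prompts, audio = queue.Queue(), queue.Queue()
    stop = threading.Event()
    t = threading.Thread(target=live_loop,
                         args=(StubLiveMusicModel(), prompts, audio, stop))
    t.start()
    prompts.put("uptempo breakbeat")  # user steers the stream mid-performance
    audio.get(); audio.get()          # consume two chunks, then shut down
    stop.set(); t.join()
```

Running generation on its own thread and feeding prompt updates through a queue is what makes control feel synchronous in this sketch: the performer can steer at any moment, and the change lands at the next chunk boundary while the stream keeps playing.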