🤖 AI Summary
This study addresses key challenges in human–AI real-time musical improvisation—namely, high latency, opaque intent, and weak collaboration—by proposing a low-latency, interpretable AI co-creative framework. Methodologically, we develop a lightweight Transformer-based generative model, fine-tuned via reinforcement learning to jointly optimize musical coherence and real-time responsiveness; introduce an “expectation visualization” mechanism that explicitly renders the model’s generative plan as actionable motion trajectories; and integrate a Web-based audio streaming pipeline with optimized interaction protocols to achieve end-to-end latency under 50 ms. A controlled user study with professional musicians demonstrates statistically significant improvements in ensemble naturalness, performer engagement, and creative fluency (p < 0.01). The framework establishes a novel paradigm for explainable, real-time AI–human musical collaboration, advancing both technical performance and human-centered design in interactive AI music systems.
📝 Abstract
Recent advances in generative artificial intelligence (AI) have produced models capable of high-quality musical content generation. However, little consideration has been given to using these models in real-time or cooperative jamming applications, which impose crucial requirements: low latency, the ability to communicate planned actions, and the ability to adapt to user input in real time. To support these needs, we introduce ReaLJam, an interface and protocol for live musical jamming sessions between a human and a Transformer-based AI agent trained with reinforcement learning. We enable real-time interaction using the concept of anticipation, where the agent continually predicts how the performance will unfold and visually conveys its plan to the user. We conduct a user study in which experienced musicians jam in real time with the agent through ReaLJam. Our results demonstrate that ReaLJam enables enjoyable and musically interesting sessions, and we uncover important takeaways for future work.