🤖 AI Summary
Current end-to-end automatic music generation systems lack support for iterative human–machine interaction, hindering computer-assisted composition. This paper proposes a personalized, multi-track, long-context symbolic music infilling method designed for human–AI co-creation on edge devices, enabling low-latency and high-consistency real-time collaboration. Our approach introduces three key contributions: (1) MIDI-RWKV, the first RWKV-7 linear-architecture-based model tailored for efficient modeling of multi-track MIDI sequences; (2) a state-level initialization fine-tuning strategy that achieves personalized adaptation with minimal training samples; and (3) a lightweight edge deployment framework. Quantitative and qualitative evaluations demonstrate significant improvements over baselines in musical coherence, expressiveness, and responsiveness. The model, training code, and inference toolkit are publicly released to ensure reproducibility and facilitate personalized music completion applications.
📝 Abstract
Existing work in automatic music generation has primarily focused on end-to-end systems that produce complete compositions or continuations. However, because musical composition is typically an iterative process, such systems make it difficult to engage in the back-and-forth between human and machine that is essential to computer-assisted creativity. In this study, we address the task of personalizable, multi-track, long-context, and controllable symbolic music infilling to enhance the process of computer-assisted composition. We present MIDI-RWKV, a novel model based on the RWKV-7 linear architecture, to enable efficient and coherent musical co-creation on edge devices. We also demonstrate that MIDI-RWKV admits an effective method of fine-tuning its initial state for personalization in the very-low-sample regime. We evaluate MIDI-RWKV and its state tuning on several quantitative and qualitative metrics, and release model weights and code at https://github.com/christianazinn/MIDI-RWKV.
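The core idea behind state tuning is that a recurrent model's initial hidden state is itself a small trainable parameter: freezing all weights and optimizing only that state adapts the model with very few samples. The sketch below illustrates this on a toy linear recurrence; it is a hypothetical, minimal illustration (not the MIDI-RWKV or RWKV-7 implementation), and all names, dimensions, and the tiny "personalization" dataset are invented for the example:

```python
import numpy as np

# Toy illustration of state-level fine-tuning: the "model" weights
# A (recurrence), B (input projection), and C (readout) are frozen;
# only the initial hidden state h0 is optimized on a tiny dataset.
rng = np.random.default_rng(0)
d = 4                               # hidden state size (illustrative)
A = 0.3 * rng.normal(size=(d, d))   # frozen recurrence matrix
B = rng.normal(size=(d, 1))         # frozen input projection
C = rng.normal(size=(1, d))         # frozen readout

def forward(h0, xs):
    """Run the linear recurrence h <- A h + B x over a sequence."""
    h = h0
    for x in xs:
        h = A @ h + B * x
    return float(C @ h)

def loss(h0, data):
    return sum((forward(h0, xs) - y) ** 2 for xs, y in data)

# Tiny "personalization" set: (input sequence, target output) pairs.
data = [([0.1, 0.2], 1.0), ([0.3], 0.5)]

h0 = np.zeros((d, 1))
loss_before = loss(h0, data)

lr = 0.05
for _ in range(500):
    grad = np.zeros_like(h0)
    for xs, y in data:
        err = forward(h0, xs) - y
        # The recurrence is linear, so d(output)/d(h0) = C @ A^len(xs).
        grad += 2 * err * (C @ np.linalg.matrix_power(A, len(xs))).T
    h0 -= lr * grad / len(data)     # gradient step on h0 only

loss_after = loss(h0, data)
```

Because gradients flow only into `h0`, the number of trained parameters equals the hidden state size, which is what makes this kind of adaptation feasible in the very-low-sample regime; in a real RWKV-style model the same principle applies per-layer to the recurrent state, with the transformer-equivalent weights left untouched.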