🤖 AI Summary
Existing automatic music transcription methods for violin lack joint modeling of pitch and playing technique, rely heavily on manual annotations, and generalize poorly. Method: This paper proposes a lightweight end-to-end multi-task model that simultaneously detects note onsets/offsets, estimates pitch, and classifies six canonical playing techniques (e.g., vibrato, bow change, harmonics) within a unified framework. To address the scarcity of real-world labeled data, the authors introduce MOSA-VPT—a high-fidelity synthetic dataset—and design a physics-informed data augmentation strategy to generate audio with precise technique annotations. Contribution/Results: The model is optimized via joint multi-task training and achieves state-of-the-art performance on real recordings: 89.3% F1-score for technique classification—significantly outperforming prior approaches—while requiring no manual annotation.
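The multi-task setup described above (shared representation feeding separate onset, offset, pitch, and technique outputs, trained with a joint loss) can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual architecture: the feature dimension, head shapes, loss weighting, and all names (`MultiTaskHead`, `joint_loss`) are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """Hypothetical sketch: per-frame features from a shared encoder
    feed four task-specific linear heads (onset, offset, pitch, technique)."""
    def __init__(self, feat_dim=256, n_pitches=88, n_techniques=6):
        super().__init__()
        self.onset = nn.Linear(feat_dim, 1)        # frame-wise onset logit
        self.offset = nn.Linear(feat_dim, 1)       # frame-wise offset logit
        self.pitch = nn.Linear(feat_dim, n_pitches)       # pitch class per frame
        self.technique = nn.Linear(feat_dim, n_techniques)  # 6 playing techniques

    def forward(self, feats):  # feats: (batch, time, feat_dim)
        return {
            "onset": self.onset(feats).squeeze(-1),
            "offset": self.offset(feats).squeeze(-1),
            "pitch": self.pitch(feats),
            "technique": self.technique(feats),
        }

def joint_loss(out, tgt):
    """Unweighted sum of per-task losses; real systems typically
    weight or schedule the individual terms."""
    bce = nn.functional.binary_cross_entropy_with_logits
    ce = nn.functional.cross_entropy
    return (bce(out["onset"], tgt["onset"])
            + bce(out["offset"], tgt["offset"])
            + ce(out["pitch"].flatten(0, 1), tgt["pitch"].flatten())
            + ce(out["technique"].flatten(0, 1), tgt["technique"].flatten()))
```

Joint training of this kind lets the technique head share acoustic features learned for pitch and timing, which is the usual motivation for unifying the tasks rather than training separate classifiers.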
📝 Abstract
While automatic music transcription is well-established in music information retrieval, most models are limited to transcribing pitch and timing information from audio, and thus omit crucial expressive and instrument-specific nuances. One example is playing technique on the violin, which affords the instrument its distinct palette of timbres for maximal emotional impact. Here, we propose **VioPTT** (Violin Playing Technique-aware Transcription), a lightweight, end-to-end model that directly transcribes violin playing technique in addition to pitch onset and offset. Furthermore, we release **MOSA-VPT**, a novel, high-quality synthetic violin playing technique dataset to circumvent the need for manually labeled annotations. Leveraging this dataset, our model demonstrates strong generalization to real-world note-level violin technique recordings in addition to achieving state-of-the-art transcription performance. To our knowledge, VioPTT is the first to jointly combine violin transcription and playing technique prediction within a unified framework.