Score-Informed BiLSTM Correction for Refining MIDI Velocity in Automatic Piano Transcription

📅 2025-08-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address inaccurate MIDI note velocity estimation in Automatic Music Transcription (AMT), this paper proposes a score-informed BiLSTM correction module. Timing is assumed to be already correct, so the approach does not reconstruct temporal structure; instead it jointly models raw audio features and reference MIDI scores to explicitly estimate and correct velocity errors. Unlike earlier score-informed methods that re-estimate velocity with a purpose-built model and discard the AMT output, the module refines the AMT-estimated velocity, so it can be attached to existing systems such as High-Resolution Piano Transcription (HPT) as a post-processing step. It is presented as the first work to incorporate a BiLSTM into the AMT post-processing stage. Experiments with HPT show a significant improvement in velocity estimation accuracy, with Mean Absolute Error (MAE) reduced by 18.7%, while keeping compatibility with existing pipelines. Although the method does not reach state-of-the-art (SOTA) performance, the results support the “score-aware + sequential modeling” correction paradigm for AMT.
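The MAE figure quoted above is a relative reduction, which is easy to misread as an absolute change in velocity units. The sketch below shows how such a number could be computed. The function names and the toy velocity lists are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between two equal-length velocity lists."""
    if len(predicted) != len(reference):
        raise ValueError("velocity lists must have the same length")
    if not predicted:
        raise ValueError("velocity lists must not be empty")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def relative_reduction(mae_before, mae_after):
    """Fraction by which the error shrank, relative to the starting error."""
    return (mae_before - mae_after) / mae_before


# Toy values (hypothetical, NOT results from the paper).
reference_velocity = [64, 80, 50, 100]   # ground-truth MIDI velocities
amt_velocity = [70, 70, 60, 90]          # velocities from an AMT system
corrected_velocity = [66, 76, 54, 96]    # velocities after a correction step

mae_before = mean_absolute_error(amt_velocity, reference_velocity)
mae_after = mean_absolute_error(corrected_velocity, reference_velocity)
reduction = relative_reduction(mae_before, mae_after)

print(f"MAE before: {mae_before:.2f}, after: {mae_after:.2f}, "
      f"relative reduction: {reduction:.1%}")
```

Under this definition, an 18.7% reduction means the corrected MAE is 81.3% of the original AMT MAE, whatever the absolute values are.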

📝 Abstract

MIDI is a modern standard for storing music that records how musical notes are played. Many piano performances have corresponding MIDI scores available online. Some of these are created by the original performer, who records on an electric piano alongside the audio, while others are produced through manual transcription. In recent years, automatic music transcription (AMT) has advanced rapidly, enabling machines to transcribe MIDI from audio. However, these transcriptions often require further correction. Assuming perfect timing correction, we focus on loudness correction in terms of MIDI velocity (the MIDI parameter that controls loudness). This task can be approached through score-informed MIDI velocity estimation, which has undergone several developments. While previous approaches introduced purpose-built models to re-estimate MIDI velocity, thereby replacing AMT estimates, we propose a BiLSTM correction module to refine AMT-estimated velocity. Although we did not reach state-of-the-art performance, we validated our method on a well-known AMT system, high-resolution piano transcription (HPT), and achieved significant improvements.
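The difference between refining an AMT estimate and replacing it can be sketched as a residual update: a correction module predicts a per-note offset, and that offset is added to the AMT velocity. The code below assumes this residual form, which the abstract does not spell out. The BiLSTM itself is not shown, the offsets are stand-in numbers, and `apply_velocity_correction` is a hypothetical helper. The only fixed fact used is that MIDI note-on velocities lie in the range 1 to 127.

```python
def apply_velocity_correction(amt_velocities, predicted_offsets, low=1, high=127):
    """Add per-note offsets to AMT velocities and clamp to the MIDI range.

    amt_velocities:    velocities produced by the AMT system
    predicted_offsets: per-note corrections (assumed here to come from a
                       correction model such as a BiLSTM; not implemented)
    """
    if len(amt_velocities) != len(predicted_offsets):
        raise ValueError("one offset is required per note")
    corrected = []
    for velocity, offset in zip(amt_velocities, predicted_offsets):
        value = int(round(velocity + offset))          # back to an integer velocity
        corrected.append(max(low, min(high, value)))   # keep within 1..127
    return corrected


# Stand-in values for illustration only.
amt_velocities = [70, 120, 5]
predicted_offsets = [-4.2, 12.0, -9.6]
corrected = apply_velocity_correction(amt_velocities, predicted_offsets)
print(corrected)
```

A practical consequence of this form is that a zero offset leaves the AMT estimate untouched, so the correction step can only change a note where the module predicts an error.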

Problem

Research questions and friction points this paper is trying to address.

Refining MIDI velocity in automatic piano transcription

Improving loudness correction using a BiLSTM module

Enhancing AMT-estimated velocity without replacing it

Innovation

Methods, ideas, or system contributions that make the work stand out.

BiLSTM correction module for MIDI velocity

Score-informed refinement of AMT estimates

Validation on the high-resolution piano transcription (HPT) system
