High-Resolution Sustain Pedal Depth Estimation from Piano Audio Across Room Acoustics

📅 2025-07-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Prior work on piano sustain pedal detection treats the task as binary on/off classification, which cannot capture the continuous depth control used in expressive performance. This paper introduces a Transformer-based framework for continuous pedal depth estimation that matches state-of-the-art performance on binary detection while also producing fine-grained, musically meaningful depth predictions. The model is trained on synthetic audio rendered under multiple room acoustic conditions and evaluated on an unseen room with a leave-one-out protocol. Experiments show high accuracy in continuous depth estimation. However, neither the proposed model nor the two baselines is robust to unseen room conditions, and statistical analysis confirms that reverberation biases predictions toward overestimation, which makes acoustic generalization an open challenge for real-world deployment.

📝 Abstract

Piano sustain pedal detection has previously been approached as a binary on/off classification task, limiting its application in real-world piano performance scenarios where pedal depth significantly influences musical expression. This paper presents a novel approach for high-resolution estimation that predicts continuous pedal depth values. We introduce a Transformer-based architecture that not only matches state-of-the-art performance on the traditional binary classification task but also achieves high accuracy in continuous pedal depth estimation. Furthermore, by estimating continuous values, our model provides musically meaningful predictions for sustain pedal usage, whereas baseline models, limited to binary detection, struggle to capture such nuanced expression. Additionally, this paper investigates the influence of room acoustics on sustain pedal estimation using a synthetic dataset that includes varied acoustic conditions. We train our model with different combinations of room settings and test it in an unseen environment using a "leave-one-out" approach. Our findings show that neither the two baseline models nor ours is robust to unseen room conditions. Statistical analysis further confirms that reverberation influences model predictions and introduces an overestimation bias.
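To make the terms "overestimation bias" and "binary versus continuous" concrete, the following is a minimal sketch, not the paper's evaluation code. It assumes pedal depth is normalised to the range 0 to 1 and that a threshold of 0.5 separates "off" from "on"; both are illustrative assumptions, as are the function names and the sample values.

```python
from statistics import mean


def mean_signed_error(predicted, reference):
    """Average of (predicted - reference) per frame.

    A positive value means the predictions are, on average, deeper than
    the reference, i.e. an overestimation bias.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return mean(p - r for p, r in zip(predicted, reference))


def binarize(depths, threshold=0.5):
    """Collapse continuous depths to on/off labels (threshold is an assumption)."""
    return [1 if d >= threshold else 0 for d in depths]


# Hypothetical per-frame depths, normalised to [0, 1].
reference = [0.0, 0.25, 0.5, 1.0]
predicted = [0.25, 0.5, 0.5, 1.0]

bias = mean_signed_error(predicted, reference)
labels = binarize(predicted)
```

With these sample values the signed error is positive, which is the direction of bias the abstract attributes to reverberation. The binarized labels also show what a binary detector discards: a half-pressed and a fully pressed pedal both map to the same "on" label.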

Problem

Research questions and friction points this paper is trying to address.

Estimating continuous piano sustain pedal depth from audio

Improving musical expression via high-resolution pedal prediction

Assessing room acoustics impact on pedal estimation accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based continuous pedal depth estimation

Synthetic dataset for varied room acoustics

Leave-one-out testing in unseen environments
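The leave-one-out protocol listed above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: the room names, clip identifiers, and the `leave_one_room_out` helper are all hypothetical, and the actual dataset layout is not described on this page.

```python
def leave_one_room_out(clips_by_room):
    """Yield (held_out_room, train_clips, test_clips) once per room.

    Each room is held out in turn: the model would be trained on clips
    from every other room and tested only on the held-out room, so the
    test acoustics are never seen during training.
    """
    for held_out in sorted(clips_by_room):
        train = [
            clip
            for room, clips in sorted(clips_by_room.items())
            if room != held_out
            for clip in clips
        ]
        test = list(clips_by_room[held_out])
        yield held_out, train, test


# Hypothetical rooms and clip identifiers, for illustration only.
rooms = {
    "dry": ["clip_a", "clip_b"],
    "hall": ["clip_c"],
    "studio": ["clip_d"],
}

splits = list(leave_one_room_out(rooms))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Splitting by room rather than by clip is the point of the design: a random clip-level split would let the same room acoustics appear in both training and test data and would hide the generalization gap the paper reports.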

🔎 Similar Papers

Can Audio Reveal Music Performance Difficulty? Insights From the Piano Syllabus Dataset