A Holistic Evaluation of Piano Sound Quality

📅 2023-10-07
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Piano purchasing decisions currently lack objective, interpretable methods for evaluating a piano's intrinsic tonal quality, free from the confounding effects of playing technique. Method: We propose the first comprehensive evaluation framework focused exclusively on pianos' inherent timbral characteristics. The approach establishes a timbre classification system based on subjective listening, fine-tunes pre-trained CNNs as piano classifiers, applies ERB-scale spectrogram analysis for acoustic interpretability, and uses focal loss to address class imbalance. Musicianship background is included as a covariate to quantify its effect on timbre discrimination ability. Contribution/Results: (1) achieves 98.3% accuracy in automatic piano classification; (2) provides the first empirical evidence that musically trained listeners are statistically significantly better at perceiving timbral differences; (3) improves model interpretability by aligning ERB-scale visualizations with perceptual labels, enabling acoustically grounded explanations of the classifier's decisions. This work establishes a human-in-the-loop paradigm that integrates objective and subjective assessment of musical instrument timbre.
📝 Abstract
This paper aims to develop a holistic evaluation method for piano sound quality to assist in purchasing decisions. Unlike previous studies that focused on the effect of piano performance techniques on sound quality, this study evaluates the inherent sound quality of different pianos. To derive quality evaluation systems, the study uses subjective questionnaires based on a piano sound quality dataset. The method selects the optimal piano classification model by comparing the fine-tuning results of different pre-trained Convolutional Neural Network (CNN) backbones. To improve the interpretability of the models, the study applies Equivalent Rectangular Bandwidth (ERB) analysis. The results reveal that musically trained individuals are better able to distinguish sound quality differences between pianos. The best fine-tuned pre-trained CNN backbone achieves an accuracy of 98.3% as the piano classifier. However, the dataset is limited, and the audio is sliced to increase the number of samples, which leaves the data lacking in diversity and balance; focal loss is therefore used to reduce the impact of data imbalance. To optimize the method, future research will expand the dataset or employ few-shot learning techniques.
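The abstract names ERB analysis but does not give the formulas behind it. As a rough illustration only, the sketch below uses the widely cited Glasberg and Moore approximation of the ERB scale; whether the paper uses this exact formulation is an assumption, and the function names are hypothetical rather than taken from the authors' code.

```python
import math


def erb_bandwidth_hz(freq_hz: float) -> float:
    """ERB in Hz at a centre frequency (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def hz_to_erb_rate(freq_hz: float) -> float:
    """Map a frequency in Hz to its position on the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * freq_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate: float) -> float:
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(f_min: float, f_max: float, n_bands: int) -> list:
    """Centre frequencies spaced uniformly on the ERB-rate scale."""
    if n_bands < 2:
        raise ValueError("n_bands must be at least 2")
    low, high = hz_to_erb_rate(f_min), hz_to_erb_rate(f_max)
    step = (high - low) / (n_bands - 1)
    return [erb_rate_to_hz(low + i * step) for i in range(n_bands)]


# Hypothetical usage: 8 bands spanning the fundamentals of an 88-key piano
# (A0 at 27.5 Hz up to C8 at roughly 4186 Hz).
centers = erb_center_frequencies(27.5, 4186.0, 8)
```

Spacing bands on this scale gives finer resolution at low frequencies and coarser resolution at high frequencies, which is why ERB-scale spectrograms are often considered closer to how listeners resolve timbre than linear-frequency ones.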
Problem

Research questions and friction points this paper is trying to address.

Develop a holistic evaluation method for piano sound quality
Compare CNN models for optimal piano classification
Address dataset limitations with focal loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses CNN fine-tuning for piano classification
Applies ERB analysis for model interpretability
Employs focal loss to handle data imbalance
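Focal loss, mentioned above as the remedy for data imbalance, down-weights examples the classifier already handles well so that hard or rare classes contribute more to training. The following is a minimal pure-Python sketch of the standard multi-class form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The values of gamma and alpha are illustrative assumptions; the paper's actual hyperparameters and training code are not given in this summary.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one example.

    logits: raw class scores; target: index of the true class.
    gamma:  focusing parameter (gamma = 0 recovers cross-entropy).
    alpha:  optional per-class weights (assumed, not from the paper).
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy example (confident and correct) is penalised far less than a
# hard one (confident and wrong), relative to plain cross-entropy.
easy = focal_loss([4.0, 0.0, 0.0], target=0)
hard = focal_loss([4.0, 0.0, 0.0], target=1)
```

In a real fine-tuning run the same expression would be applied per batch inside a deep learning framework; the scalar version here only shows how the modulating factor (1 - p_t)^gamma changes the loss.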
Authors

Monan Zhou
Central Conservatory of Music, Beijing, China
Shangda Wu
Tencent
Symbolic Music Generation, Music Information Retrieval, Multimodal Learning
Shaohua Ji
Central Conservatory of Music, Beijing, China
Zijin Li
Central Conservatory of Music, Beijing, China
Wei Li
Central Conservatory of Music, Beijing, China; Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 201203, China