A Holistic Evaluation of Piano Sound Quality

📅 2023-10-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current piano purchasing decisions lack objective, interpretable methods for evaluating a piano's intrinsic tonal quality independently of playing technique. Method: We propose the first comprehensive evaluation framework focused exclusively on pianos' inherent timbral characteristics. The approach establishes a timbre classification system based on subjective listening, fine-tunes pre-trained CNNs as piano classifiers, applies ERB-scale spectrogram analysis for acoustic interpretability, and uses focal loss to address class imbalance. Musicianship background is further included as a covariate to quantify its effect on timbre discrimination ability. Contribution/Results: (1) The best model achieves 98.3% accuracy in automatic piano classification; (2) the study provides the first empirical evidence that musically trained listeners are statistically significantly better at perceiving timbral differences; (3) model interpretability is improved by aligning ERB-scale visualizations with perceptual labels, which enables acoustically grounded explanations of the classifier's decisions. This work establishes a novel human-in-the-loop paradigm that integrates objective and subjective assessment of musical instrument timbre.

📝 Abstract

This paper aims to develop a holistic evaluation method for piano sound quality to assist in purchasing decisions. Unlike previous studies, which focused on the effect of piano performance techniques on sound quality, this study evaluates the inherent sound quality of different pianos. To derive quality evaluation systems, the study uses subjective questionnaires based on a piano sound quality dataset. The method selects the best piano classification model by comparing fine-tuning results across different pre-trained Convolutional Neural Network (CNN) backbones. To improve the interpretability of the models, the study applies Equivalent Rectangular Bandwidth (ERB) analysis. The results reveal that musically trained individuals are better able to distinguish sound quality differences between pianos. The best fine-tuned CNN backbone achieves 98.3% accuracy as the piano classifier. However, the dataset is limited, and the audio is sliced to increase the number of samples, which leaves the data lacking in diversity and balance; focal loss is therefore used to reduce the impact of data imbalance. In future research, the dataset will be expanded or few-shot learning techniques will be employed to improve the method.
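The abstract does not give the exact form of the focal loss used. As a minimal sketch, assuming the standard multi-class formulation FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), the following shows how the loss down-weights examples the classifier already gets right, so that rarer or harder classes contribute relatively more. The function names and the defaults `gamma=2.0` and `alpha=1.0` are illustrative assumptions, not values taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one example: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are assumed defaults for illustration only.
    With gamma = 0 and alpha = 1 this reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical logits for a 3-class piano classifier.
easy_example = [4.0, 0.0, 0.0]  # confidently correct for class 0
hard_example = [0.0, 1.0, 0.0]  # class 0 is the target but scored low

for name, logits in [("easy", easy_example), ("hard", hard_example)]:
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}")
```

The modulating factor (1 - p_t)^gamma shrinks the loss of the confidently correct example far more than that of the misclassified one, which is the property that makes focal loss useful when some classes dominate the sliced audio.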

Problem

Research questions and friction points this paper is trying to address.

Develop holistic piano sound quality evaluation method

Compare CNN models for optimal piano classification

Address dataset limitations with focal loss

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses CNN fine-tuning for piano classification

Applies ERB analysis for model interpretability

Employs focal loss to handle data imbalance
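The ERB analysis listed above is not specified in detail on this page. As a rough sketch, assuming the widely used Glasberg and Moore approximation of the auditory filter bandwidth, the helpers below compute the ERB at a given frequency and place band centers evenly on the ERB-rate scale, which is the usual starting point for an ERB-scale spectrogram. The function names, the band count, and the frequency range (27.5 Hz is the lowest piano key, A0; the 8000 Hz upper limit is arbitrary) are assumptions for illustration.

```python
import math


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437


def erb_center_frequencies(n_bands, f_min=27.5, f_max=8000.0):
    """Return n_bands center frequencies equally spaced on the ERB-rate scale.

    f_min and f_max are assumed defaults, not values from the paper.
    """
    if n_bands < 2:
        raise ValueError("n_bands must be at least 2")
    lo, hi = hz_to_erb_rate(f_min), hz_to_erb_rate(f_max)
    step = (hi - lo) / (n_bands - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_bands)]


centers = erb_center_frequencies(8)
for fc in centers:
    print(f"center={fc:8.1f} Hz, ERB={erb_bandwidth(fc):7.1f} Hz")
```

Bands spaced this way are narrow at low frequencies and wide at high frequencies, which roughly mirrors the frequency resolution of human hearing and is why an ERB-scale representation can be easier to relate to perceptual labels than a linear-frequency spectrogram.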
