SCOREQ: Speech Quality Assessment with Contrastive Regression

📅 2024-10-09
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing no-reference speech quality assessment (SQA) methods suffer from poor generalization, struggling to adapt to diverse acoustic characteristics and the continuous nature of subjective Mean Opinion Scores (MOS). To address this, we propose SCOREQ, a novel contrastive regression framework that introduces triplet loss into speech quality regression for the first time, overcoming the limitations of conventional L2 loss in modeling MOS continuity. By explicitly learning relative quality relationships via contrastive learning, and integrating a lightweight deep neural network architecture with an incremental evaluation strategy, SCOREQ significantly enhances cross-domain robustness. Evaluated on multi-domain speech datasets, SCOREQ consistently outperforms state-of-the-art no-reference metrics, effectively mitigating performance degradation caused by domain shift.

📝 Abstract
In this paper, we present SCOREQ, a novel approach for speech quality prediction. SCOREQ is a triplet loss function for contrastive regression that addresses the domain generalisation shortcoming exhibited by state of the art no-reference speech quality metrics. In the paper we: (i) illustrate the problem of L2 loss training failing at capturing the continuous nature of the mean opinion score (MOS) labels; (ii) demonstrate the lack of generalisation through a benchmarking evaluation across several speech domains; (iii) outline our approach and explore the impact of the architectural design decisions through incremental evaluation; (iv) evaluate the final model against state of the art models for a wide variety of data and domains. The results show that the lack of generalisation observed in state of the art speech quality metrics is addressed by SCOREQ. We conclude that using a triplet loss function for contrastive regression improves generalisation for speech quality prediction models but also has potential utility across a wide range of applications using regression-based predictive models.
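To make the idea of a triplet loss for contrastive regression more concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `mos_triplet_loss`, the margin value, the Euclidean distance, and the exhaustive enumeration of triplets are all assumptions chosen for illustration. The only idea taken from the abstract is that relative MOS relationships, rather than absolute L2 error, drive the embedding.

```python
import numpy as np


def mos_triplet_loss(embeddings, mos, margin=0.2):
    """Mean triplet hinge loss guided by MOS labels (illustrative sketch).

    For every anchor a, a sample p counts as the "positive" and a sample q
    as the "negative" whenever p is closer to a in MOS than q is. The loss
    then asks p to also be closer to a in embedding space, by `margin`.
    """
    emb = np.asarray(embeddings, dtype=float)
    mos = np.asarray(mos, dtype=float)

    # Pairwise Euclidean distances between embeddings (assumed metric).
    diff = emb[:, None, :] - emb[None, :, :]
    d_emb = np.sqrt((diff ** 2).sum(axis=-1))

    # Pairwise absolute differences between MOS labels.
    d_mos = np.abs(mos[:, None] - mos[None, :])

    n = len(mos)
    losses = []
    for a in range(n):
        for p in range(n):
            for q in range(n):
                if a == p or a == q or p == q:
                    continue
                # Keep only triplets where p is strictly closer in MOS.
                if d_mos[a, p] < d_mos[a, q]:
                    hinge = max(0.0, d_emb[a, p] - d_emb[a, q] + margin)
                    losses.append(hinge)

    return float(np.mean(losses)) if losses else 0.0


# Toy data: three utterances with MOS 1, 2 and 4 and 1-D embeddings.
mos_labels = [1.0, 2.0, 4.0]
ordered = [[0.0], [1.0], [3.0]]    # embedding order follows MOS order
scrambled = [[0.0], [3.0], [1.0]]  # embedding order contradicts MOS order

loss_ordered = mos_triplet_loss(ordered, mos_labels)
loss_scrambled = mos_triplet_loss(scrambled, mos_labels)
```

In this toy setup the loss is zero when distances in embedding space respect the MOS ordering and positive when they contradict it. A practical training loop would compute the same quantity on batches of learned embeddings with an automatic-differentiation framework and a more selective triplet sampling strategy.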
Problem

Research questions and friction points this paper is trying to address.

No-reference speech quality assessment
Generalisation across diverse speech domains
Capturing the continuous nature of MOS labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

SCOREQ
contrastive regression
speech quality prediction