AI Summary
Objective assessment of speech intelligibility for hearing-impaired listeners currently relies heavily on clean reference signals, a major bottleneck in clinical and hearing-aid fitting scenarios. To address this, we propose DeepGESI, the first fully non-intrusive deep learning model for reference-free prediction of GESI (Gammachirp Envelope Similarity Index), an intelligibility metric designed for hearing-impaired listeners. DeepGESI takes only distorted speech as input and performs end-to-end regression, jointly modeling time-frequency acoustic representations and hearing-loss perception priors. Unlike conventional reference-dependent methods, DeepGESI estimates GESI without any reference signal, which makes it considerably more practical in real-world applications. Evaluated on the CPC2 dataset, its predictions correlate strongly with the reference GESI scores (Spearman ρ > 0.92), and inference is over 20× faster than with prior approaches. This work establishes a new paradigm for objective, efficient, and personalized speech intelligibility assessment tailored to hearing impairment.
Abstract
Speech intelligibility assessment is essential for many speech-related applications. However, most objective intelligibility metrics are intrusive: they require clean reference speech in addition to the degraded or processed signal. Furthermore, existing metrics such as STOI are designed primarily for normal-hearing listeners, and their accuracy in predicting intelligibility for hearing-impaired listeners remains limited. The Gammachirp Envelope Similarity Index (GESI), by contrast, can estimate intelligibility for hearing-impaired listeners, but it is also intrusive because it depends on reference signals. This requirement limits its applicability in real-world scenarios.
To overcome this limitation, this study proposes DeepGESI, a non-intrusive deep learning-based model that accurately and efficiently predicts speech intelligibility for hearing-impaired listeners without requiring any clean reference speech. Experimental results show that, under the test conditions of the 2nd Clarity Prediction Challenge (CPC2) dataset, the GESI scores predicted by DeepGESI correlate strongly with the actual GESI scores. In addition, the proposed model predicts scores substantially faster than conventional methods.
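The agreement between predicted and actual GESI scores described above can be illustrated with a rank correlation. The following is a minimal sketch, not the authors' evaluation code. It assumes Spearman rank correlation as the measure, and the function names (`average_ranks`, `pearson`, `spearman`) and the score values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for illustration only (not results from the paper).
reference_gesi = [0.12, 0.35, 0.48, 0.60, 0.81]
predicted_gesi = [0.15, 0.40, 0.38, 0.58, 0.79]

rho = spearman(reference_gesi, predicted_gesi)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure only checks whether the predictor orders utterances the same way as the intrusive metric, so it is usually reported alongside an error measure such as RMSE when absolute score values matter.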