A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models

📅 2025-09-03

🤖 AI Summary

This study addresses zero-shot, non-intrusive prediction of subjective speech intelligibility for hearing aid (HA) users. We propose GPT-Whisper-HA, an extension of GPT-Whisper that applies MSBG hearing loss simulation and NAL-R gain compensation to the input audio according to each listener's audiogram, then transcribes the processed audio with two Whisper-based automatic speech recognition (ASR) modules. GPT-4o predicts one intelligibility score per ASR output, and the two scores are averaged to give the final estimate. To our knowledge, this is the first work to apply large language models (LLMs) to zero-shot, non-intrusive HA speech assessment. Being zero-shot and non-intrusive, the approach needs neither training on subjective listener ratings nor a clean reference signal. In the reported experiments, GPT-Whisper-HA achieves a 2.59% relative reduction in root mean square error (RMSE) compared with the baseline GPT-Whisper, which supports the potential of LLMs for predicting subjective intelligibility for HA users.

📝 Abstract

This work focuses on zero-shot non-intrusive speech assessment for hearing aids (HA) using large language models (LLMs). Specifically, we introduce GPT-Whisper-HA, an extension of GPT-Whisper, a zero-shot non-intrusive speech assessment model based on LLMs. GPT-Whisper-HA is designed for speech assessment for HA, incorporating MSBG hearing loss and NAL-R simulations to process audio input based on each individual's audiogram, two automatic speech recognition (ASR) modules for audio-to-text representation, and GPT-4o to predict two corresponding scores, followed by score averaging for the final estimated score. Experimental results indicate that GPT-Whisper-HA achieves a 2.59% relative root mean square error (RMSE) improvement over GPT-Whisper, confirming the potential of LLMs for zero-shot speech assessment in predicting subjective intelligibility for HA users.
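The 2.59% figure is a relative RMSE improvement. As a rough illustration of that arithmetic, the sketch below assumes the usual definition, (baseline RMSE - new RMSE) / baseline RMSE, which the abstract does not spell out. The function names are hypothetical and the numbers in the usage note are made up, not taken from the paper.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and listener-reported scores."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse_improvement(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse.

    Assumed definition: 100 * (baseline - new) / baseline.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse
```

For example, with an illustrative baseline RMSE of 20.0 and a new RMSE of 19.0, `relative_rmse_improvement(20.0, 19.0)` gives a 5% relative improvement.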

Problem

Research questions and friction points this paper is trying to address.

Zero-shot non-intrusive speech intelligibility assessment for hearing aids

Using large language models to predict subjective intelligibility scores

Incorporating personalized hearing loss simulations and ASR modules

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPT-Whisper-HA extension for hearing aid assessment

Incorporates MSBG and NAL-R simulations for personalized processing

Uses dual ASR modules with GPT-4o for score prediction
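The overall flow described above can be sketched structurally as follows. This is a minimal sketch, not the authors' implementation: every stage is a caller-supplied callable with a hypothetical name (`process_audio`, `asr_modules`, `score_transcript`), and the sketch assumes that both ASR modules receive the same audiogram-processed audio, a detail this summary does not specify. No real MSBG, NAL-R, Whisper, or GPT-4o API is called.

```python
from statistics import mean
from typing import Callable, Sequence

Audio = Sequence[float]


def predict_intelligibility(
    audio: Audio,
    audiogram: Sequence[float],
    process_audio: Callable[[Audio, Sequence[float]], Audio],
    asr_modules: Sequence[Callable[[Audio], str]],
    score_transcript: Callable[[str], float],
) -> float:
    """Average the per-ASR intelligibility scores for one utterance.

    process_audio    -- hypothetical stand-in for the audiogram-based
                        MSBG / NAL-R processing stage
    asr_modules      -- hypothetical stand-ins for the two ASR modules
    score_transcript -- hypothetical stand-in for the LLM scoring step
    """
    if not asr_modules:
        raise ValueError("at least one ASR module is required")
    # Assumption: the same processed audio feeds every ASR module.
    processed = process_audio(audio, audiogram)
    # One transcript and one predicted score per ASR module.
    scores = [score_transcript(asr(processed)) for asr in asr_modules]
    # Final estimate is the plain average of the predicted scores.
    return mean(scores)
```

Passing the stages in as callables keeps the sketch independent of any particular simulation, ASR, or LLM library, so each stage can be replaced by a real component or a test stub.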
