A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models

📅 2025-09-03
🤖 AI Summary
This study addresses the challenge of zero-shot, non-intrusive assessment of subjective speech intelligibility for hearing aid (HA) users. We propose GPT-Whisper-HA, an extension of GPT-Whisper that integrates MSBG-based hearing loss simulation and NAL-R gain compensation, driven by each listener's audiogram, to construct dual automatic speech recognition (ASR) pathways using Whisper. GPT-4o takes the individualized auditory characteristics and the ASR transcripts into account to predict an intelligibility score for each pathway, and the two scores are averaged to give the final estimate. To our knowledge, this is the first work to incorporate large language models (LLMs) into zero-shot, non-intrusive HA speech evaluation, requiring neither subjective user annotations nor speech signal reconstruction. Experiments show that GPT-Whisper-HA achieves a 2.59% relative improvement in root mean square error (RMSE) over the baseline GPT-Whisper when predicting subjective intelligibility. These results indicate the potential of LLMs for personalized, zero-shot auditory assessment.

📝 Abstract
This work focuses on zero-shot non-intrusive speech assessment for hearing aids (HA) using large language models (LLMs). Specifically, we introduce GPT-Whisper-HA, an extension of GPT-Whisper, a zero-shot non-intrusive speech assessment model based on LLMs. GPT-Whisper-HA is designed for speech assessment for HA, incorporating MSBG hearing loss and NAL-R simulations to process audio input based on each individual's audiogram, two automatic speech recognition (ASR) modules for audio-to-text representation, and GPT-4o to predict two corresponding scores, followed by score averaging for the final estimated score. Experimental results indicate that GPT-Whisper-HA achieves a 2.59% relative root mean square error (RMSE) improvement over GPT-Whisper, confirming the potential of LLMs for zero-shot speech assessment in predicting subjective intelligibility for HA users.
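
As a reading aid for the reported metric, the sketch below computes RMSE and a relative RMSE improvement in percent. It assumes the common definition, (baseline RMSE - new RMSE) / baseline RMSE x 100; the abstract does not spell out the formula, and the function names and the numbers in the usage note are illustrative, not the paper's results.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root mean square error between predicted and target scores."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse_improvement(baseline_rmse: float, new_rmse: float) -> float:
    """Relative improvement in percent (assumed definition, see lead-in)."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse
```

With made-up values, a baseline RMSE of 10.0 and a new RMSE of 9.0 give a relative improvement of 10%; a figure such as 2.59% therefore describes a ratio between two RMSE values, not an absolute drop in error.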
Problem

Research questions and friction points this paper is trying to address.

Zero-shot non-intrusive speech intelligibility assessment for hearing aids
Using large language models to predict subjective intelligibility scores
Incorporating personalized hearing loss simulations and ASR modules
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPT-Whisper-HA extension for hearing aid assessment
Incorporates MSBG and NAL-R simulations for personalized processing
Uses dual ASR modules with GPT-4o for score prediction
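
The flow listed above (audiogram-driven processing, two ASR pathways, one LLM score per pathway, then averaging) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `Pathway`, `process`, `transcribe`, and `score_with_llm` are hypothetical placeholders standing in for the MSBG/NAL-R processing, the Whisper ASR modules, and the GPT-4o prompt, and the summary does not specify which processing feeds which pathway.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Sequence

Audio = Sequence[float]
Audiogram = Dict[int, float]  # hypothetical layout: frequency (Hz) -> hearing level (dB HL)


@dataclass
class Pathway:
    """One processing + ASR branch (hypothetical structure)."""
    process: Callable[[Audio, Audiogram], Audio]  # placeholder for hearing loss / gain simulation
    transcribe: Callable[[Audio], str]            # placeholder for an ASR module


def predict_intelligibility(
    audio: Audio,
    audiogram: Audiogram,
    pathways: Sequence[Pathway],
    score_with_llm: Callable[[str, Audiogram], float],
) -> float:
    """Score each pathway's transcript with the LLM, then average the scores."""
    if not pathways:
        raise ValueError("at least one pathway is required")
    scores = []
    for pathway in pathways:
        processed = pathway.process(audio, audiogram)
        transcript = pathway.transcribe(processed)
        scores.append(score_with_llm(transcript, audiogram))
    return mean(scores)
```

Passing the components in as callables keeps the sketch independent of any particular ASR or LLM client; in practice each placeholder would wrap the corresponding real component.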