🤖 AI Summary
This study addresses the high error rates in cross-cultural clinical speech recognition, which stem from speaker accent variation and domain-specific medical terminology, by conducting the first systematic evaluation of large language models (LLMs) for error correction in clinical speech transcription across Nigeria, the UK, and the US. Methodologically, it takes the outputs of mainstream automatic speech recognition (ASR) systems as input and combines medical terminology constraints, accent-aware contextual modeling, and prompt-engineered LLM-based post-processing. The results show that LLMs substantially improve transcription accuracy across regions, achieving 62–79% correction rates for frequently misrecognized medical terms, while also delineating the operational boundaries and limitations of LLMs in low-resource accent scenarios. The core contribution is characterizing how region-specific ASR performance disparities interact with LLM correction capability, providing empirical evidence and a practical technical pathway for deploying multilingual, culturally adaptive AI in healthcare settings.
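The summary above mentions prompt-engineered LLM-based post-processing of ASR output under medical terminology constraints. The sketch below illustrates roughly what such a correction step could look like; it is not the authors' actual pipeline. The OpenAI Python client call is a stand-in for whichever LLM the study used, and the model name, prompt wording, and `MEDICAL_TERMS` list are illustrative assumptions.

```python
# Minimal sketch of LLM-based post-processing for ASR transcripts.
# Assumptions (not from the paper): the OpenAI Python client, the model name,
# the prompt wording, and the example terminology list are all illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical domain lexicon used as a soft terminology constraint in the prompt.
MEDICAL_TERMS = ["metformin", "hypertension", "myocardial infarction", "paracetamol"]

def correct_transcript(asr_output: str, region: str) -> str:
    """Ask an LLM to repair likely ASR errors, given accent/region context."""
    prompt = (
        f"The following text is an automatic speech recognition transcript of a "
        f"clinical dictation by an English speaker from {region}. "
        f"Correct any misrecognized words, paying particular attention to medical "
        f"terminology such as: {', '.join(MEDICAL_TERMS)}. "
        f"Return only the corrected transcript.\n\n"
        f"Transcript: {asr_output}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the study's model choice may differ
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic correction
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(correct_transcript("patient was started on met forming 500 mg", "Nigeria"))
```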
📝 Abstract
The global adoption of Large Language Models (LLMs) in healthcare holds promise for enhancing clinical workflows and improving patient outcomes. However, Automatic Speech Recognition (ASR) errors in critical medical terms remain a significant challenge, and such errors can compromise patient care and safety if they go undetected. This study investigates the prevalence and impact of ASR errors in medical transcription in Nigeria, the United Kingdom, and the United States. By evaluating raw and LLM-corrected transcriptions of accented English in these regions, we assess the potential and limitations of LLMs in addressing accent- and medical-terminology-related challenges in ASR. Our findings highlight significant disparities in ASR accuracy across regions and identify specific conditions under which LLM corrections are most effective.
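As one illustration of the kind of raw-versus-corrected comparison described in the abstract, the sketch below computes a per-term correction rate over a set of transcripts. The function name and the matching rule (case-insensitive substring presence of each reference term) are assumptions for illustration, not the study's actual evaluation protocol.

```python
# Illustrative correction-rate computation; the matching rule is an assumption,
# not the metric reported in the paper.
from typing import Iterable

def term_correction_rate(
    reference_terms: Iterable[str],
    raw_transcripts: list[str],
    corrected_transcripts: list[str],
) -> float:
    """Fraction of reference-term occurrences missing from the raw ASR output
    but restored in the LLM-corrected output."""
    missed, recovered = 0, 0
    for raw, corrected in zip(raw_transcripts, corrected_transcripts):
        for term in reference_terms:
            if term.lower() not in raw.lower():        # term misrecognized by ASR
                missed += 1
                if term.lower() in corrected.lower():  # restored after LLM correction
                    recovered += 1
    return recovered / missed if missed else 0.0

# Toy usage with made-up data:
rate = term_correction_rate(
    ["metformin"],
    raw_transcripts=["started on met forming 500 mg"],
    corrected_transcripts=["started on metformin 500 mg"],
)
print(f"correction rate: {rate:.0%}")  # -> 100%
```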