Can Large Language Models Reliably Correct Errors in Low-Resource ASR? A Contamination-Aware Case Study on West Frisian

📅 2026-05-19

🤖 AI Summary
This study addresses the limited performance of automatic speech recognition (ASR) for low-resource languages and the open question of how well large language models (LLMs) perform generative error correction (GER) when data contamination is a concern. The authors evaluate LLM-based GER for Frisian on a public corpus and on a newly constructed offline dataset of non-public texts, which controls for potential contamination. GER improves ASR performance in most settings, and the best GPT-5.1 results surpass the oracle word error rate (WER). Comparable gains on the offline dataset suggest that the improvements reflect genuine correction ability rather than memorized test data.
📝 Abstract
Automatic speech recognition (ASR) has improved substantially in recent years, yet performance remains limited for low-resource languages. Large language models (LLMs) have shown promise for improving ASR through generative error correction (GER), but their effectiveness in low-resource settings remains underexplored. In addition, it remains unclear to what extent data contamination influences the reported improvements in LLM-based GER. This study investigates LLM-based GER for low-resource Frisian. In addition to a public corpus, we construct and use a Frisian offline dataset with non-public texts for evaluation to control for potential data contamination. Results show that GER improves ASR performance in most settings, with the best GPT-5.1 results surpassing oracle WERs. Comparable gains on the offline dataset indicate that improvements reflect true correction ability. We further provide a detailed error analysis revealing model correction patterns.
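To make the "surpassing oracle WERs" claim concrete, the sketch below computes corpus-level WER and an N-best oracle WER. It assumes that "oracle" means picking, for each utterance, the hypothesis in the ASR N-best list with the fewest word errors; the abstract does not define the term, so this is one common reading and not necessarily the paper's. The function names and the toy English sentences are invented for illustration and are not taken from the paper or its data.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, 1):
        cur = [i]
        for j, h in enumerate(hyp_words, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def wer(refs, hyps):
    """Corpus-level WER: total word errors divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def oracle_wer(refs, nbest_lists):
    """N-best oracle WER (assumed definition): best hypothesis per utterance."""
    errors = sum(
        min(edit_distance(r.split(), h.split()) for h in nbest)
        for r, nbest in zip(refs, nbest_lists)
    )
    words = sum(len(r.split()) for r in refs)
    return errors / words


# Toy data, invented for illustration only.
refs = ["it is a nice day"]
nbest = [["it is nice day", "it was a nice day"]]
corrected = ["it is a nice day"]  # hypothetical GER output

baseline = wer(refs, [n[0] for n in nbest])  # WER of the 1-best hypothesis
oracle = oracle_wer(refs, nbest)
ger = wer(refs, corrected)
print(baseline, oracle, ger)
```

Under this reading, a GER system can fall below the oracle WER because it is free to generate words that appear in no single N-best hypothesis, whereas the oracle can only select among the existing hypotheses.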
Problem

Research questions and friction points this paper is trying to address.

low-resource ASR
large language models
generative error correction
data contamination
West Frisian
Innovation

Methods, ideas, or system contributions that make the work stand out.

low-resource ASR
generative error correction
data contamination
large language models
Frisian