Can Large Language Models Reliably Correct Errors in Low-Resource ASR? A Contamination-Aware Case Study on West Frisian

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses two open questions: how well large language models (LLMs) perform generative error correction (GER) on automatic speech recognition (ASR) output for a low-resource language, and how much data contamination inflates the gains reported for LLM-based GER. In addition to a public corpus, the authors construct an offline West Frisian evaluation set from non-public texts to control for potential contamination. GER improves ASR performance in most settings, and the best GPT-5.1 results surpass the oracle word error rate (WER). Gains on the offline set are comparable to those on the public corpus, which indicates that the improvements reflect genuine correction ability rather than exposure to the test data. The paper also provides an error analysis of the models' correction patterns.

📝 Abstract

Automatic speech recognition (ASR) has improved substantially in recent years, yet performance remains limited for low-resource languages. Large language models (LLMs) have shown promise for improving ASR through generative error correction (GER), but their effectiveness in low-resource settings remains underexplored. In addition, it remains unclear to what extent data contamination influences the reported improvements in LLM-based GER. This study investigates LLM-based GER for low-resource Frisian. In addition to a public corpus, we construct and use a Frisian offline dataset with non-public texts for evaluation to control for potential data contamination. Results show that GER improves ASR performance in most settings, with the best GPT-5.1 results surpassing oracle WERs. Comparable gains on the offline dataset indicate that improvements reflect true correction ability. We further provide a detailed error analysis revealing model correction patterns.
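To make the "surpassing oracle WERs" claim concrete, the sketch below computes word error rate (WER) and an oracle WER on toy data. It assumes the common GER setup in which the ASR system emits an N-best list per utterance and the oracle picks the hypothesis closest to the reference; the abstract does not state which oracle definition the authors use. The function names and the English example sentences are illustrative and not taken from the paper.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(refs: List[str], hyps: List[str]) -> float:
    """Corpus-level WER: total word edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def oracle_wer(refs: List[str], nbest: List[List[str]]) -> float:
    """WER obtained by picking, per utterance, the N-best entry nearest the reference.

    Assumption: this N-best oracle is one common definition, not necessarily the paper's.
    """
    errors = sum(min(edit_distance(r.split(), h.split()) for h in hyps)
                 for r, hyps in zip(refs, nbest))
    words = sum(len(r.split()) for r in refs)
    return errors / words


# Toy data (made up for illustration, not from the paper's corpora).
refs = ["the cat sat on the mat"]
nbest = [["the cat sad on a mat", "the cat sat on a mat"]]
one_best = [hyps[0] for hyps in nbest]
corrected = ["the cat sat on the mat"]  # hypothetical LLM output

baseline = wer(refs, one_best)       # 2 errors / 6 words
oracle = oracle_wer(refs, nbest)     # 1 error / 6 words
after_ger = wer(refs, corrected)     # 0 errors / 6 words
```

Here the corrected transcript appears nowhere in the N-best list, so its WER falls below the oracle. Under this definition, beating the oracle means the model generated words that no ASR hypothesis contained, rather than only selecting among existing hypotheses.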

Problem

Research questions and friction points this paper is trying to address.

low-resource ASR

large language models

generative error correction

data contamination

West Frisian

Innovation

Methods, ideas, or system contributions that make the work stand out.

low-resource ASR

generative error correction

data contamination