A Benchmark of French ASR Systems Based on Error Severity

📅 2025-01-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional word error rate (WER) metrics for French automatic speech recognition (ASR) overlook the semantic impact of errors, limiting their ability to reflect human comprehension. Method: We introduce the first error-severity-aware French ASR benchmark, proposing a linguistics-driven, four-level severity classification framework centered on content words. Our approach combines context-sensitive linguistic rules with intelligibility-oriented criteria to systematically annotate and quantify error impact. Contribution/Results: We evaluate ten state-of-the-art French ASR systems, covering both HMM-based and end-to-end models, providing a unified view of linguistic fidelity and user reading experience in French ASR assessment. Results reveal substantial cross-model variation in semantic fidelity and textual readability, identifying the system best aligned with human comprehension. This work establishes an evaluation paradigm and methodological foundation for semantics-aware ASR assessment.

📝 Abstract
Automatic Speech Recognition (ASR) transcriptions are commonly assessed using metrics that compare the hypothesis with a reference transcription, such as Word Error Rate (WER), which measures surface-form deviations from the reference, or semantic score-based metrics. However, these approaches often overlook what remains understandable to humans when interpreting transcription errors. To address this limitation, a new evaluation is proposed that categorizes errors into four levels of severity, further divided into subtypes, based on objective linguistic criteria, contextual patterns, and the use of content words as the unit of analysis. This metric is applied to a benchmark of 10 state-of-the-art ASR systems for French, encompassing both HMM-based and end-to-end models. Our findings reveal the strengths and weaknesses of each system, identifying those that provide the most comfortable reading experience for users.
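The WER baseline the abstract contrasts against can be sketched as a word-level Levenshtein distance normalized by reference length. The sketch below (a standard dynamic-programming implementation, not the paper's code) also illustrates the limitation the paper targets: a harmless one-letter slip ("chat" → "chas") costs exactly as much as a meaning-destroying substitution would, since WER is blind to error severity.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / |reference|,
    computed as word-level Levenshtein distance via dynamic programming."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution out of three words, whatever its semantic impact:
print(wer("le chat dort", "le chas dort"))  # ≈ 0.333
```

A severity-aware metric, by contrast, would weight that substitution by how much it degrades comprehension, which is what the proposed four-level classification provides.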
Problem

Research questions and friction points this paper is trying to address.

French Speech Recognition
Error Severity Assessment
User Experience Measurement
Innovation

Methods, ideas, or system contributions that make the work stand out.

French Speech Recognition
Error Severity Grading
User Experience Improvement