Beyond Word Error Rate: Auditing the Diversity Tax in Speech Recognition through Dataset Cartography

2026-03-05
AI Summary
This work proposes a novel auditing framework for automatic speech recognition (ASR) systems that addresses the limitations of traditional word error rate (WER) evaluation, which often fails to capture semantic fidelity and obscures systemic biases against marginalized and non-canonical speakers. By integrating a sample difficulty index (SDI) with data cartography techniques and incorporating semantic metrics, such as Embedding Error Rate (EmbER) and Semantic Distance (SemDist), the framework systematically uncovers model biases and performance disparities invisible to WER. This approach enables interpretable, prospective fairness analyses of ASR systems, offering a practical tool for detecting and mitigating bias prior to deployment.

๐Ÿ“ Abstract
Automatic speech recognition (ASR) systems are predominantly evaluated using the Word Error Rate (WER). However, raw token-level metrics fail to capture semantic fidelity and routinely obscures the `diversity tax', the disproportionate burden on marginalized and atypical speaker due to systematic recognition failures. In this paper, we explore the limitations of relying solely on lexical counts by systematically evaluating a broader class of non-linear and semantic metrics. To enable rigorous model auditing, we introduce the sample difficulty index (SDI), a novel metric that quantifies how intrinsic demographic and acoustic factors drive model failure. By mapping SDI on data cartography, we demonstrate that metrics EmbER and SemDist expose hidden systemic biases and inter-model disagreements that WER ignores. Finally, our findings are the first steps towards a robust audit framework for prospective safety analysis, empowering developers to audit and mitigate ASR disparities prior to deployment.
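
To make the abstract's point about token-level metrics concrete, the following is a minimal sketch of the standard WER definition (word-level edit distance divided by the number of reference words). It is not the authors' evaluation code; the function name `word_error_rate` and the example sentences are hypothetical and chosen only for illustration. The sketch does not implement SDI, EmbER, or SemDist, whose definitions are not given in this summary.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / reference length.

    Illustrative helper (hypothetical name), not the paper's implementation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two hypotheses, each with one substituted word.
reference = "the patient is not allergic to penicillin"
meaning_flipped = "the patient is now allergic to penicillin"
meaning_kept = "a patient is not allergic to penicillin"

wer_flipped = word_error_rate(reference, meaning_flipped)
wer_kept = word_error_rate(reference, meaning_kept)
print(wer_flipped, wer_kept)
```

Both hypotheses differ from the reference by a single substituted word, so they receive the same WER even though only one of them reverses the meaning of the sentence. This is the kind of gap in semantic fidelity that the abstract says lexical counts cannot capture.
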
Problem

Research questions and friction points this paper is trying to address.

Word Error Rate
diversity tax
speech recognition
systemic bias
semantic fidelity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sample Difficulty Index
Dataset Cartography
Semantic Metrics
Diversity Tax
ASR Bias Audit
Ting-Hui Cheng
Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark
Line H. Clemmensen
University of Copenhagen
Machine learning, multivariate statistics, statistical modelling, sparse modelling
Sneha Das
Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark