Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks

📅 2025-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically investigates how automatic speech recognition (ASR) transcription noise affects downstream natural language understanding tasks, specifically dialogue summarization, intent classification, and slot filling. It proposes the first standardized evaluation framework spanning multiple noise types (substitution, deletion, insertion) and intensities, integrating controlled noise injection, comparative analysis of transcript-cleaning techniques, and multi-task benchmarking across T5, BART, LED, and Flan-T5. Key findings: (1) models exhibit distinct noise-tolerance thresholds; (2) substitution errors are the most detrimental to performance; and (3) cleaning methods yield substantial gains only under high-noise conditions. The work further characterizes fine-grained robustness disparities across models with respect to specific ASR error types, providing reproducible empirical evidence and methodological guidance for designing robust spoken language understanding (SLU) systems.
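To make the controlled-noise-injection idea concrete, below is a minimal sketch of word-level corruption at a configurable error rate and error-type mix. It assumes uniform random substitutions and insertions drawn from a small vocabulary; the function name `inject_noise` and its parameters are illustrative stand-ins, not the paper's actual API.

```python
import random

def inject_noise(tokens, error_rate, type_mix=(0.5, 0.3, 0.2), vocab=None, seed=0):
    """Corrupt a token list with substitution/deletion/insertion errors.

    error_rate is the per-token probability of an error; type_mix gives the
    relative weights of (substitution, deletion, insertion) when one occurs.
    """
    rng = random.Random(seed)
    vocab = list(vocab) if vocab else tokens  # fall back to the input's own words
    noisy = []
    for tok in tokens:
        if rng.random() < error_rate:
            kind = rng.choices(("sub", "del", "ins"), weights=type_mix)[0]
            if kind == "sub":
                noisy.append(rng.choice(vocab))         # wrong word transcribed
            elif kind == "ins":
                noisy.extend([tok, rng.choice(vocab)])  # keep word, add a spurious one
            # kind == "del": drop the word entirely
        else:
            noisy.append(tok)
    return noisy

print(" ".join(inject_noise("book a table for two at seven".split(), error_rate=0.3)))
```

Separating the overall error rate from the type mix is what lets an experiment vary noise intensity and noise type independently, as the framework's design calls for.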

📝 Abstract
With the increasing prevalence of recorded human speech, spoken language understanding (SLU) is essential for its efficient processing. In order to process the speech, it is commonly transcribed using automatic speech recognition technology. This speech-to-text transition introduces errors into the transcripts, which subsequently propagate to downstream NLP tasks, such as dialogue summarization. While it is known that transcript noise affects downstream tasks, a systematic approach to analyzing its effects across different noise severities and types has not been addressed. We propose a configurable framework for assessing task models in diverse noisy settings, and for examining the impact of transcript-cleaning techniques. The framework facilitates the investigation of task model behavior, which can in turn support the development of effective SLU solutions. We exemplify the utility of our framework on three SLU tasks and four task models, offering insights regarding the effect of transcript noise on tasks in general and models in particular. For instance, we find that task models can tolerate a certain level of noise, and are affected differently by the types of errors in the transcript.
Problem

Research questions and friction points this paper is trying to address.

How do different types and severities of transcription noise affect downstream SLU tasks?
No systematic, configurable framework exists for assessing task models under controlled noisy conditions.
When do transcript-cleaning techniques actually improve downstream performance?
Innovation

Methods, ideas, or system contributions that make the work stand out.

A configurable framework for injecting transcription noise of varying types and intensities
Comparative analysis of transcript-cleaning techniques across noise conditions
Multi-task, multi-model benchmarking (T5, BART, LED, Flan-T5) on dialogue summarization, intent classification, and slot filling (see the sketch after this list)
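A hedged sketch of what such a benchmarking loop might look like, reusing `inject_noise` from the sketch above. Here `run_task_model` and `task_metric` are hypothetical placeholder stubs for a real SLU model (e.g. a fine-tuned T5) and a real task metric (e.g. ROUGE or intent accuracy); the paper's framework is configurable and not tied to these names.

```python
# Hypothetical stand-ins; swap in a real model and metric in practice.
def run_task_model(text):
    return text.split()[0] if text else ""  # dummy "intent" = first word

def task_metric(prediction, reference):
    return float(prediction == reference)   # dummy exact-match accuracy

def benchmark(dataset, noise_rates=(0.0, 0.1, 0.2, 0.4), cleaner=None):
    """Sweep noise intensities and report the mean task score at each level.

    dataset yields (transcript, reference) pairs; cleaner is an optional
    transcript-cleaning function applied before the task model runs.
    """
    results = {}
    for rate in noise_rates:
        scores = []
        for transcript, reference in dataset:
            noisy = " ".join(inject_noise(transcript.split(), rate))
            if cleaner is not None:
                noisy = cleaner(noisy)          # e.g. an ASR-error corrector
            prediction = run_task_model(noisy)  # summarizer / intent classifier
            scores.append(task_metric(prediction, reference))
        results[rate] = sum(scores) / len(scores)
    return results  # maps noise rate -> mean task score

data = [("play some jazz music", "play"), ("cancel my alarm", "cancel")]
print(benchmark(data))  # {0.0: 1.0, ...} with the dummy stubs above
```

Comparing `benchmark(data)` against `benchmark(data, cleaner=some_cleaner)` across the rate sweep is the kind of comparison the paper uses to show that cleaning pays off mainly under high-noise conditions.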