What do the metrics mean? A critical analysis of the use of Automated Evaluation Metrics in Interpreting

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing automatic evaluation metrics, such as BLEU and TER, commonly overlook communicative context in interpreting quality assessment, thereby failing to accurately reflect the real-world effectiveness of interpretation. This study addresses this gap by incorporating communicative context as a core dimension, integrating theoretical insights from interpreting studies with empirical analyses of automatic metrics to systematically examine their applicability and limitations in authentic interpreting scenarios. The findings reveal that context-independent automatic metrics cannot reliably evaluate interpreting quality in isolation and must be complemented by contextual factors for a valid assessment. By foregrounding the situated nature of interpreting, this work provides both theoretical grounding and methodological innovation for developing evaluation frameworks that better align with the communicative essence of interpreting practice.

📝 Abstract
With the growth of interpreting technologies, from remote interpreting and Computer-Aided Interpreting to automated speech translation and interpreting avatars, there is now high demand for ways to quickly and efficiently measure the quality of any interpreting delivered. A range of approaches to meet this need have been proposed, each involving some measure of automation. This article examines these recently proposed quality measurement methods and discusses their suitability for measuring the quality of authentic interpreting practice, whether delivered by humans or machines, concluding that automatic metrics as currently proposed cannot take the communicative context into account and are therefore not viable measures of the quality of any interpreting provision when used on their own. Across all attempts to measure, or even categorise, quality in Interpreting Studies, the contexts in which interpreting takes place have become fundamental to the final analysis.
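The context-blindness the abstract describes can be made concrete with a toy sketch of a BLEU-style metric. This is not the paper's method and not real BLEU (which uses corpus-level statistics, up to 4-grams, and standardised tokenisation, as in sacrebleu); the function name and example sentences below are invented for illustration. The point is that a string-overlap score compares a rendering only against a reference string, so it cannot reward a rendering that is contextually apt but lexically different.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    """Toy BLEU-like score: geometric mean of clipped n-gram
    precisions (here up to bigrams) times a brevity penalty.
    Purely surface-level; knows nothing about the communicative
    setting in which either utterance was produced."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c, r = ngrams(cand, n), ngrams(ref, n)
        overlap = sum((c & r).values())          # clipped matches
        total = max(sum(c.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # floor avoids log(0)
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "please take a seat"
print(round(simple_bleu("please take a seat", ref), 3))  # → 1.0
print(round(simple_bleu("take a seat please", ref), 3))  # high: same words
print(round(simple_bleu("sit down now", ref), 3))        # near zero
```

The last candidate scores near zero despite being a plausible rendering in an informal setting, while a word-for-word copy scores perfectly regardless of whether it suits the situation, which is exactly the limitation the paper argues makes such metrics unviable on their own.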
Problem

Research questions and friction points this paper is trying to address.

automated evaluation metrics
interpreting quality
communicative context
speech translation
quality measurement
Innovation

Methods, ideas, or system contributions that make the work stand out.

automated evaluation metrics
interpreting quality
communicative context
critical analysis
speech translation