🤖 AI Summary
Machine translation (MT) quality evaluation traditionally relies on bilingual assessments, which are costly and less reflective of real-world monolingual user scenarios. Method: This study proposes and empirically validates a context-aware monolingual human evaluation paradigm. A context-enhanced evaluation framework is designed in which professional translators assign quality scores and annotate errors in the target language only, supplemented by qualitative feedback and statistical testing (p < 0.05) against bilingual baselines. Contribution/Results: Monolingual evaluation achieves high agreement with bilingual evaluation across system-level scoring, pairwise comparisons, and error-type distributions, and improves evaluation efficiency by ~40% while better approximating authentic user conditions. This work provides the first empirical validation of high-fidelity monolingual MT evaluation, establishing a scalable, cost-effective, and practically viable methodology for MT quality assessment.
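To make the notion of "agreement with bilingual evaluation" concrete, the sketch below computes correlation between monolingual and bilingual quality scores for the same MT outputs. The scores, the 0-100 scale, and the choice of Pearson/Spearman statistics are illustrative assumptions, not details reported by the paper.

```python
# Illustrative sketch (not from the paper): agreement between monolingual
# and bilingual quality scores assigned to the same MT outputs.
# Scores are made-up placeholders on an assumed 0-100 scale.
from scipy.stats import pearsonr, spearmanr

monolingual_scores = [78, 85, 62, 90, 71, 88, 55, 80]   # hypothetical target-only ratings
bilingual_scores   = [75, 88, 60, 92, 70, 85, 58, 83]   # hypothetical ratings with source text

r, p_r = pearsonr(monolingual_scores, bilingual_scores)      # linear agreement
rho, p_rho = spearmanr(monolingual_scores, bilingual_scores)  # rank agreement

print(f"Pearson r = {r:.2f} (p = {p_r:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
```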
📝 Abstract
This paper explores the potential of context-aware monolingual human evaluation for assessing machine translation (MT) when no source text is available for reference. To this end, we compare monolingual evaluation with bilingual evaluation (with source text) under two scenarios: the evaluation of a single MT system, and the pairwise comparison of two MT systems. Four professional translators performed both monolingual and bilingual evaluations by assigning ratings, annotating errors, and providing feedback on their experience. Our findings indicate that context-aware monolingual human evaluation achieves outcomes comparable to bilingual evaluation, suggesting that monolingual evaluation is a feasible and efficient approach to assessing MT quality.
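For the pairwise scenario, agreement between the two protocols could be summarized as the proportion of items on which both prefer the same system, optionally chance-corrected. A minimal sketch follows, assuming made-up preference labels and Cohen's kappa as the agreement statistic; the paper's actual statistic and data are not specified here.

```python
# Illustrative sketch (not from the paper): do monolingual and bilingual
# evaluations prefer the same system in pairwise comparisons?
# Preference labels ("A" or "B" per item) are made-up placeholders.
from sklearn.metrics import cohen_kappa_score

mono_prefs = ["A", "A", "B", "A", "B", "A", "A", "B", "A", "A"]  # hypothetical
bili_prefs = ["A", "A", "B", "A", "A", "A", "A", "B", "A", "A"]  # hypothetical

# Raw agreement: fraction of items where both protocols pick the same winner.
agreement = sum(m == b for m, b in zip(mono_prefs, bili_prefs)) / len(mono_prefs)
# Chance-corrected agreement between the two sets of preferences.
kappa = cohen_kappa_score(mono_prefs, bili_prefs)

print(f"Raw agreement: {agreement:.0%}")
print(f"Cohen's kappa: {kappa:.2f}")
```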