Is Human Annotation Necessary? Iterative MBR Distillation for Error Span Detection in Machine Translation

📅 2026-03-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of error span detection in machine translation evaluation, which typically relies on costly and inconsistent human annotations. The authors propose a fully unsupervised, self-evolving framework that leverages Minimum Bayes Risk (MBR) decoding to guide large language models in generating high-quality pseudo-labels, combined with iterative knowledge distillation to progressively enhance model performance. Remarkably, this approach achieves state-of-the-art results without any human supervision: on the WMT Metrics Shared Task datasets, models trained solely on self-generated pseudo-labels outperform supervised baselines trained on human annotations at both the system and span levels, while remaining competitive at the sentence level.
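The core selection step can be illustrated with a small sketch: given several candidate error-span annotations sampled from an LLM for one translation, MBR decoding keeps the candidate with the highest expected utility against the other samples. The `span_f1` utility and all function names below are illustrative assumptions, not the paper's implementation.

```python
def span_f1(a, b):
    """F1 overlap between two annotations, each a set of
    (start, end, severity) error spans. Illustrative utility choice."""
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    overlap = len(a & b)
    precision = overlap / len(a)
    recall = overlap / len(b)
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def mbr_select(candidates):
    """Return the candidate with the highest total utility (lowest
    Bayes risk) against all other sampled candidates."""
    best, best_score = None, float("-inf")
    for i, c in enumerate(candidates):
        score = sum(span_f1(c, o) for j, o in enumerate(candidates) if j != i)
        if score > best_score:
            best, best_score = c, score
    return best

# Example: three LLM-sampled annotations for one translation hypothesis.
cands = [
    {(3, 7, "major")},
    {(3, 7, "major"), (10, 12, "minor")},  # one sample adds a spurious span
    {(3, 7, "major")},
]
print(mbr_select(cands))  # {(3, 7, 'major')}
```

Because the spurious span appears in only one sample, the consensus annotation wins; this is how MBR filtering can yield pseudo-labels cleaner than any single LLM sample.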

📝 Abstract
Error Span Detection (ESD) is a crucial subtask in Machine Translation (MT) evaluation, aiming to identify the location and severity of translation errors. While fine-tuning models on human-annotated data improves ESD performance, acquiring such data is expensive and prone to inconsistencies among annotators. To address this, we propose a novel self-evolution framework based on Minimum Bayes Risk (MBR) decoding, named Iterative MBR Distillation for ESD, which eliminates the reliance on human annotations by leveraging an off-the-shelf LLM to generate pseudo-labels. Extensive experiments on the WMT Metrics Shared Task datasets demonstrate that models trained solely on these self-generated pseudo-labels outperform both the unadapted base model and supervised baselines trained on human annotations at the system and span levels, while maintaining competitive sentence-level performance.
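The iterative part of the framework can be sketched as a pseudo-labeling round that is repeated between fine-tuning steps: sample candidate annotations per segment, keep the MBR consensus, and train on it. Everything here (`pseudo_label_round`, the exact-match utility, the fake sampler) is a hypothetical simplification for illustration, not the authors' code.

```python
def consensus(candidates):
    """MBR-style pick with a deliberately simplified 0/1 utility
    (exact match): choose the candidate agreeing with the most samples."""
    return max(candidates, key=lambda c: sum(c == o for o in candidates))

def pseudo_label_round(sample_fn, corpus, n_samples=4):
    """One pseudo-labeling round: sample candidate error-span annotations
    for each (source, hypothesis) pair and keep the MBR consensus as the
    training target for the next distillation step."""
    return [(src, hyp, consensus(sample_fn(src, hyp, n_samples)))
            for src, hyp in corpus]

# Stand-in for LLM sampling: three samples agree on one major span,
# one sample adds a noisy extra span.
def fake_samples(src, hyp, n):
    clean = frozenset({(8, 11, "major")})
    noisy = frozenset({(8, 11, "major"), (0, 3, "minor")})
    return [clean] * (n - 1) + [noisy]

corpus = [("Quelle est la date ?", "What is teh date?")]
labels = pseudo_label_round(fake_samples, corpus)
print(labels[0][2])  # frozenset({(8, 11, 'major')})
```

In the full framework this round would be interleaved with fine-tuning the model on its own MBR-filtered outputs, so that later rounds sample from a progressively stronger annotator.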
Problem

Research questions and friction points this paper is trying to address.

Error Span Detection
Machine Translation
Human Annotation
Annotation Inconsistency
MT Evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Error Span Detection
Minimum Bayes Risk
Iterative Distillation
Human Annotation-Free
Pseudo-Labeling
Boxuan Lyu
Institute of Science Tokyo
Haiyue Song
National Institute of Information and Communications Technology, Japan
Zhi Qu
Nara Institute of Science and Technology
Machine Translation
Natural Language Processing