Post-edits Are Preferences Too

📅 2024-10-03
🏛️ Conference on Machine Translation
📈 Citations: 0
Influential: 0
🤖 AI Summary
Preference optimization (PO) relies on explicit pairwise preference annotations, which are costly to collect and, for machine translation, can be unreliable. Method: this work treats human post-editing (PE) of machine translation as an implicit preference signal — by construction, the post-edited output is preferred to the original MT output (post-edit ≻ MT) — and trains a PO framework on these implicit pairs, using a supervised fine-tuning (SFT) warm-up on post-edits to stabilize training. Results: without any explicit preference labels, the approach moves the model toward post-edit-like translations and away from MT-like ones, with SFT pre-training yielding the best results. The core contribution is validating PE as a reliable, low-cost source of implicit preference data, reducing PO's dependence on explicit annotations.
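The pair construction described above can be sketched as follows; the triple layout and field names (`prompt`/`chosen`/`rejected`) are illustrative assumptions, not the paper's actual data schema:

```python
# Turn post-editing triples (source, machine translation, human post-edit)
# into pairwise preference examples: the post-edit is preferred ("chosen")
# over the raw MT output ("rejected") by construction.

def pairs_from_post_edits(triples):
    """triples: iterable of (source, mt_output, post_edit) tuples."""
    pairs = []
    for source, mt_output, post_edit in triples:
        if post_edit == mt_output:
            # Editor left the MT output unchanged: no preference signal.
            continue
        pairs.append({"prompt": source,
                      "chosen": post_edit,
                      "rejected": mt_output})
    return pairs

triples = [
    ("Der Hund schläft.", "The dog sleeps .", "The dog is sleeping."),
    ("Guten Morgen!", "Good morning!", "Good morning!"),  # unchanged, skipped
]
pairs = pairs_from_post_edits(triples)
```

Unchanged segments are dropped here because an identical post-edit encodes no strict preference; whether to keep them (e.g. as ties) is a design choice the sketch does not take from the paper.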

📝 Abstract
Preference Optimization (PO) techniques are currently among the state-of-the-art techniques for fine-tuning large language models (LLMs) on pairwise preference feedback from human annotators. However, in machine translation, this sort of feedback can be difficult to solicit. Additionally, Kreutzer et al. (2018) have shown that, for machine translation, pairwise preferences are less reliable than other forms of human feedback, such as 5-point ratings. We examine post-edits to see if they can be a source of reliable human preferences by construction. In PO, a human annotator is shown sequences $s_1$ and $s_2$ and asked for a preference judgment $s_1 \succ s_2$, while for post-editing, editors create $s_1$ and know that it should be better than $s_2$. We attempt to use these implicit preferences for PO and show that doing so helps the model move towards post-edit-like hypotheses and away from machine-translation-like hypotheses. Furthermore, we show that the best results are obtained by pre-training the model with supervised fine-tuning (SFT) on post-edits in order to promote post-edit-like hypotheses to the top output ranks.
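The abstract refers to PO generically. As an illustration only, here is the DPO objective (one widely used PO method, not necessarily the exact variant the paper uses) on a single implicit pair, assuming per-sequence log-probabilities from the policy and a frozen reference model are already available:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    chosen = human post-edit, rejected = original MT output.
    Log-probs are summed over target tokens; beta scales the
    implicit reward margin.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy prefers the post-edit
    # (relative to the reference model), large when it prefers the MT output.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy already leans toward the post-edit: small loss.
low = dpo_loss(-10.0, -14.0, -12.0, -12.0)
# Policy leans toward the raw MT output instead: larger loss.
high = dpo_loss(-14.0, -10.0, -12.0, -12.0)
```

Minimizing this loss pushes probability mass toward post-edit-like hypotheses and away from MT-like ones, which is the effect the abstract reports.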
Problem

Research questions and friction points this paper is trying to address.

Exploring post-edits as reliable human preferences
Optimizing language models using implicit preference feedback
Enhancing machine translation with post-edit-like hypotheses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses post-edits as preference indicators
Applies Preference Optimization techniques
Combines supervised fine-tuning with PO