Quality-Aware Decoding: Unifying Quality Estimation and Decoding

📅 2025-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing quality estimation (QE) models are primarily employed for post-hoc N-best re-ranking, making real-time integration into decoding—especially in document-level translation with low-quality N-best lists—challenging. This work proposes the first token-level, unidirectional QE model embedded directly within the decoding process, enabling incremental and lightweight quality prediction while jointly optimizing translation fluency and quality estimation. Our approach comprises three core components: (1) a unidirectional QE model built upon the Transformer decoder architecture; (2) quality-weighted log-probability fusion during decoding; and (3) a dynamic, quality-guided beam search strategy. Experiments across multilingual benchmarks demonstrate substantial improvements over conventional N-best re-ranking, with up to +1.39 points in XCOMET-XXL scores. Notably, the method significantly enhances robustness in long-context and low-quality candidate scenarios, particularly for document-level translation.
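The quality-weighted fusion and quality-guided beam search described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the interpolation weight `alpha`, the helper names (`expand_fn`, `qe_fn`), and the linear fusion rule are all assumptions for the sketch.

```python
import math

def fuse_scores(log_prob, qe_score, alpha=0.5):
    """Fuse the model's cumulative log-probability with a token-level QE
    score for one partial hypothesis. `alpha` is a hypothetical
    interpolation weight; the paper's exact fusion rule may differ."""
    return (1 - alpha) * log_prob + alpha * qe_score

def quality_aware_beam_step(beams, expand_fn, qe_fn, beam_size=4, alpha=0.5):
    """One step of quality-guided beam search.

    beams     : list of (token_list, cumulative_log_prob)
    expand_fn : token_list -> list of (next_token, token_log_prob)
    qe_fn     : partial hypothesis (token list) -> quality score
    All names are illustrative, not the paper's API.
    """
    candidates = []
    for tokens, lp in beams:
        for tok, tok_lp in expand_fn(tokens):
            new_tokens = tokens + [tok]
            new_lp = lp + tok_lp
            # Incremental QE on the partial translation, fused with
            # the decoder's log-probability.
            fused = fuse_scores(new_lp, qe_fn(new_tokens), alpha)
            candidates.append((new_tokens, new_lp, fused))
    # Prune to the top-k hypotheses by the fused (quality-weighted) score,
    # rather than by log-probability alone.
    candidates.sort(key=lambda c: c[2], reverse=True)
    return [(t, lp) for t, lp, _ in candidates[:beam_size]]
```

With a toy expansion function and a QE scorer that prefers one token, the beam keeps the quality-preferred hypothesis even when its log-probability is lower, which is the behavior the method relies on.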

📝 Abstract
An emerging research direction in NMT involves the use of Quality Estimation (QE) models, which have demonstrated high correlations with human judgment and can enhance translations through Quality-Aware Decoding. Although several approaches have been proposed based on sampling multiple candidate translations, none have integrated these models directly into the decoding process. In this paper, we address this by proposing a novel token-level QE model capable of reliably scoring partial translations. We build a unidirectional QE model for this, as decoder models are inherently trained and efficient on partial sequences. We then present a decoding strategy that integrates the QE model for Quality-Aware Decoding and demonstrate that translation quality improves compared to N-best list re-ranking with state-of-the-art QE models (up to $1.39$ XCOMET-XXL $\uparrow$). Finally, we show that our approach provides significant benefits in document translation tasks, where the quality of N-best lists is typically suboptimal.
Problem

Research questions and friction points this paper is trying to address.

Integrate QE models into decoding
Improve translation quality with QE
Enhance document translation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-level QE model
Unidirectional QE integration
Quality-Aware Decoding strategy