Anticipating Future with Large Language Model for Simultaneous Machine Translation

📅 2024-10-29
🏛️ North American Chapter of the Association for Computational Linguistics
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the fundamental trade-off between low latency and high translation quality in simultaneous machine translation, this paper proposes Translation by Anticipating Future (TAF), a method that uses a large language model (LLM) to predict upcoming source words. Unlike conventional approaches that rely solely on the source tokens already received, TAF opportunistically translates anticipated future words while limiting the risk of mistranslation, guiding incremental target generation without significantly increasing latency. Evaluated on four language directions (English-to-Chinese, -Japanese, -Korean, and -German), the approach achieves the best quality-latency trade-off, improving BLEU by up to 5 points over strong baselines at the same three-word latency. The implementation is publicly available.

πŸ“ Abstract
Simultaneous machine translation (SMT) takes streaming input utterances and incrementally produces target text. Existing SMT methods mainly use the partial utterance that has already arrived at the input and the generated hypothesis. Motivated by human interpreters' technique to forecast future words before hearing them, we propose $\textbf{T}$ranslation by $\textbf{A}$nticipating $\textbf{F}$uture (TAF), a method to improve translation quality while retaining low latency. Its core idea is to use a large language model (LLM) to predict future source words and opportunistically translate without introducing too much risk. We evaluate our TAF and multiple baselines of SMT on four language directions. Experiments show that TAF achieves the best translation quality-latency trade-off and outperforms the baselines by up to 5 BLEU points at the same latency (three words). Code is released at https://github.com/owaski/TAF
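The abstract's core loop (predict future source words with an LLM, then translate opportunistically without taking on too much risk) can be sketched roughly as follows. This is an illustrative toy, not the paper's actual algorithm: the sampling and translation functions are stand-ins, and the agreement-based commit rule is an assumption about how "without introducing too much risk" might be realized.

```python
# Toy sketch of the anticipate-then-translate idea: sample several possible
# continuations of the partial source, translate each, and commit only the
# target prefix on which all translations agree. All function bodies below
# are placeholder assumptions, not the models used in the paper.

def sample_continuations(partial_source, k=3):
    """Stand-in for an LLM proposing k possible future source words."""
    guesses = [["tomorrow"], ["today"], ["tomorrow"]]  # fixed toy guesses
    return [partial_source + g for g in guesses[:k]]

def translate(source_words):
    """Stand-in for an MT model: 'translates' word-by-word via uppercasing."""
    return [w.upper() for w in source_words]

def longest_common_prefix(seqs):
    """Longest token prefix shared by all hypothesis translations."""
    prefix = []
    for tokens in zip(*seqs):
        if all(t == tokens[0] for t in tokens):
            prefix.append(tokens[0])
        else:
            break
    return prefix

def anticipate_step(partial_source, committed_target):
    """Commit only target tokens that every anticipated future supports."""
    hyps = [translate(s) for s in sample_continuations(partial_source)]
    agreed = longest_common_prefix(hyps)
    return committed_target + agreed[len(committed_target):]

# The futures disagree on the last word ("tomorrow" vs. "today"), so only
# the safe prefix is emitted now; the risky token waits for more input.
print(anticipate_step(["we", "will", "meet"], []))
```

Committing only the agreed-upon prefix is one simple way to trade latency for safety: confident tokens are emitted early, while tokens that depend on the unheard future are deferred until more source words arrive.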
Problem

Research questions and friction points this paper is trying to address.

Improving simultaneous machine translation quality with future word prediction
Reducing latency in translation using large language models
Enhancing translation accuracy without significant delay increase
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLM to predict future source words
Improves translation quality with low latency
Outperforms baselines by up to 5 BLEU