🤖 AI Summary
To address the fundamental trade-off between low latency and high translation quality in real-time speech translation, this paper proposes a simultaneous machine translation method built on large language model (LLM)-guided lookahead prediction. Unlike conventional approaches that rely solely on already-received source tokens, the method employs an LLM to predict upcoming source words and introduces a risk-aware lookahead framework, Translation by Anticipating Future (TAF), that guides incremental target generation without significantly increasing latency. By tightly integrating LLM-based prediction with a simultaneous translation architecture, the approach achieves state-of-the-art latency-quality trade-offs across four language directions: English to Chinese, Japanese, Korean, and German. At the same latency of three words, it improves BLEU scores by up to 5 points over strong baselines. The implementation is publicly available.
📄 Abstract
Simultaneous machine translation (SMT) takes streaming input utterances and incrementally produces target text. Existing SMT methods mainly use the partial utterance that has already arrived at the input and the generated hypothesis. Motivated by human interpreters' technique of forecasting future words before hearing them, we propose $\textbf{T}$ranslation by $\textbf{A}$nticipating $\textbf{F}$uture (TAF), a method to improve translation quality while retaining low latency. Its core idea is to use a large language model (LLM) to predict future source words and opportunistically translate without introducing too much risk. We evaluate TAF and multiple SMT baselines on four language directions. Experiments show that TAF achieves the best translation quality-latency trade-off and outperforms the baselines by up to 5 BLEU points at the same latency (three words). Code is released at https://github.com/owaski/TAF
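The core idea, predicting several possible source continuations with an LLM and committing only the target tokens on which all resulting translations agree, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `predict_continuations` and `translate` are toy stand-ins for the LLM and the MT model, and the agreement rule is one simple way to realize "opportunistically translate without introducing too much risk".

```python
def predict_continuations(source_prefix, num_samples=3, horizon=2):
    """Stand-in for an LLM sampling plausible futures of the source prefix.
    A real system would sample from the LLM; this mock is an assumption."""
    futures = [["world", "!"], ["world", "?"], ["there", "!"]]
    return [source_prefix + f[:horizon] for f in futures[:num_samples]]

def translate(source_tokens):
    """Stand-in for an MT model: toy word-for-word lookup (assumption)."""
    table = {"hello": "bonjour", "world": "monde", "there": "là",
             "!": "!", "?": "?"}
    return [table.get(t, t) for t in source_tokens]

def taf_step(source_prefix, committed_target, num_samples=3):
    """One incremental step: translate each sampled future and commit only
    the target tokens that agree across all hypotheses (low-risk tokens)."""
    hyps = [translate(s)
            for s in predict_continuations(source_prefix, num_samples)]
    out = list(committed_target)
    i = len(out)
    while all(len(h) > i for h in hyps):
        candidates = {h[i] for h in hyps}
        if len(candidates) == 1:      # every sampled future agrees
            out.append(hyps[0][i])
            i += 1
        else:                         # disagreement: wait for more input
            break
    return out

print(taf_step(["hello"], []))  # → ['bonjour']
```

Here only "bonjour" is committed: all sampled futures translate the first word identically, while the second target word depends on which continuation actually arrives, so the system waits for more source input.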