PHRASED: Phrase Dictionary Biasing for Speech Translation

📅 2025-06-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In speech translation, low-frequency phrases are often mistranslated. To address this, we propose a dictionary-based logit biasing method built on source-to-target phrase pair mappings. It is the first approach to integrate structured phrase dictionaries into speech translation biasing, and it adapts across architectures to both streaming speech translation models and multimodal large language models (MLLMs). Our method comprises three components: (1) construction of a phrase dictionary with dynamic logit bias injection, (2) joint modeling of streaming transcription and translation, and (3) integration of external phrase knowledge into MLLMs. Experiments show a 21% relative improvement over phrase list biasing for the streaming model and an 85% relative improvement in phrase recall for MLLMs. The work demonstrates one way to inject external structured knowledge into speech translation systems.
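As a rough illustration of what dictionary-driven logit biasing can look like, the sketch below adds a fixed bonus to the logits of tokens that would start or continue a target phrase whose source phrase has appeared in the transcript. This is a minimal sketch under stated assumptions, not the paper's implementation: the toy vocabulary, the dictionary contents, the `bonus` value, and the function names `select_targets` and `bias_logits` are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical toy vocabulary and phrase dictionary, for illustration only.
VOCAB = ["<blank>", "ein", "Angriff", "Herz", "infarkt"]
PHRASE_DICT: Dict[str, List[int]] = {
    "heart attack": [3, 4],  # source phrase -> target tokens "Herz" + "infarkt"
}


def select_targets(transcript: str, phrase_dict: Dict[str, List[int]]) -> List[List[int]]:
    """Keep target phrases whose source phrase appears in the transcript so far."""
    text = transcript.lower()
    return [target for source, target in phrase_dict.items() if source in text]


def bias_logits(
    logits: Sequence[float],
    generated: Sequence[int],
    targets: Sequence[Sequence[int]],
    bonus: float = 2.0,  # assumed constant; a real system would tune this
) -> List[float]:
    """Return a copy of `logits` with `bonus` added to tokens that start or
    continue one of the active target phrases."""
    boosted = set()
    for target in targets:
        # Longest partial match first: does the tail of `generated`
        # equal the first k tokens of `target`? k == 0 means "start the phrase".
        for k in range(min(len(target) - 1, len(generated)), -1, -1):
            if k == 0 or list(generated[-k:]) == list(target[:k]):
                boosted.add(target[k])
                break
    biased = list(logits)
    for token in boosted:
        biased[token] += bonus
    return biased


def argmax(values: Sequence[float]) -> int:
    return max(range(len(values)), key=values.__getitem__)


logits = [0.0, 0.1, 1.0, 0.5, 0.2]  # unbiased model prefers "Angriff"
targets = select_targets("he had a heart attack", PHRASE_DICT)
step1 = bias_logits(logits, [1], targets)     # after "ein"
step2 = bias_logits(logits, [1, 3], targets)  # after "ein Herz"
print(VOCAB[argmax(logits)], "->", VOCAB[argmax(step1)], VOCAB[argmax(step2)])
```

The point of keying the bias on a source-to-target dictionary, rather than on a flat list of target phrases, is that only the translations of phrases actually heard in the source speech receive a bonus.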

📝 Abstract

Phrases are essential to understanding the core concepts in conversations. However, because they occur rarely in training data, translating phrases correctly is challenging in speech translation tasks. In this paper, we propose a phrase dictionary biasing method that leverages pairs of phrases mapped from the source language to the target language. We apply the method to two widely adopted model types: a transducer-based streaming speech translation model and a multimodal large language model. Experimental results show that phrase dictionary biasing outperforms phrase list biasing by a relative 21% for the streaming speech translation model. In addition, phrase dictionary biasing enables multimodal large language models to use external phrase information, achieving an 85% relative improvement in phrase recall.
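To make the reported quantities concrete, the sketch below shows one common way to compute phrase recall and a relative improvement. It assumes recall is the fraction of reference target phrases that appear verbatim (case-insensitively) in the system output; the paper's exact metric definition is not given here. The numbers in the usage line are made up for illustration and are not results from the paper.

```python
from typing import List, Sequence


def phrase_recall(hypotheses: Sequence[str], reference_phrases: Sequence[List[str]]) -> float:
    """Fraction of reference phrases found in the matching hypothesis.

    Assumed definition: case-insensitive substring match per utterance.
    """
    hits = 0
    total = 0
    for hypothesis, phrases in zip(hypotheses, reference_phrases):
        text = hypothesis.lower()
        for phrase in phrases:
            total += 1
            hits += phrase.lower() in text
    return hits / total if total else 0.0


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, e.g. 0.85 means +85%."""
    return (improved - baseline) / baseline


# Illustrative values only: a recall of 0.40 rising to 0.74 is an 85% relative gain.
print(round(relative_improvement(0.40, 0.74), 2))
```

A relative figure depends on the baseline: the same 85% relative gain corresponds to a much smaller absolute change when the baseline recall is low.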

Problem

Research questions and friction points this paper is trying to address.

Improving rare phrase translation in speech tasks

Enhancing phrase recall using dictionary biasing

Integrating external phrase data into multimodal models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Phrase dictionary biasing for speech translation

Leveraging source-target phrase pairs

Improving phrase recall in multimodal large language models

🔎 Similar Papers

LM-assisted keyword biasing with Aho-Corasick algorithm for Transducer-based ASR