🤖 AI Summary
This study investigates the impact of multi-source input strategies on English- and Chinese-to-Portuguese machine translation quality. We propose a context-enhanced paradigm that leverages a high-resource pivot language (e.g., English): pivot translations are supplied as additional context both to a large language model (GPT-4o), via prompting, and to a Transformer-based multilingual neural machine translation (NMT) system. Methodologically, we integrate multi-source encoding with shallow fusion and systematically evaluate how language distance and resource availability modulate contextual gains. Key contributions include: (1) the first empirical validation in NMT that a high-resource pivot language yields substantial improvements (+4.2 BLEU on a domain-specific dataset); (2) identification of language-distance sensitivity and resource dependency in contextual gains; and (3) demonstration that the strategy outperforms single-source baselines for Chinese→Portuguese translation, while gains diminish on general benchmarks, highlighting its particular efficacy for low-resource, typologically distant language pairs.
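The LLM side of the paradigm above can be sketched as a prompt that pairs the original source sentence with its pivot translation as a contextual cue. The function name, template wording, and example sentences below are illustrative assumptions, not the study's exact prompt:

```python
def build_multisource_prompt(source_zh: str, pivot_en: str) -> str:
    """Assemble a hypothetical multi-source prompt: the Chinese source
    sentence plus its English pivot translation as additional context."""
    return (
        "Translate the following Chinese sentence into Portuguese.\n"
        "Use the English translation as additional context.\n\n"
        f"Chinese source: {source_zh}\n"
        f"English pivot: {pivot_en}\n"
        "Portuguese translation:"
    )

prompt = build_multisource_prompt("猫坐在垫子上。", "The cat sat on the mat.")
print(prompt)
```

The resulting string would then be sent to the LLM as a single user message; the key design choice is that the model sees both sources at once rather than translating through the pivot in two separate steps.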
📝 Abstract
We explore the impact of multi-source input strategies on machine translation (MT) quality, comparing GPT-4o, a large language model (LLM), with a traditional multilingual neural machine translation (NMT) system. Using intermediate-language translations as contextual cues, we evaluate their effectiveness in enhancing English and Chinese translations into Portuguese. Results suggest that contextual information significantly improves translation quality for domain-specific datasets and potentially for linguistically distant language pairs, with diminishing returns observed on benchmarks with high linguistic variability. Additionally, we show that shallow fusion, the multi-source approach we apply within the NMT system, yields improved results when high-resource languages serve as context for other translation pairs, underscoring the importance of strategic context-language selection.
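Shallow fusion as described here can be illustrated with a toy next-token decoding step: log-probabilities conditioned on the primary source are interpolated with those conditioned on the pivot-language context. The weight `lam` and the toy distributions are illustrative assumptions, not values from the study:

```python
import math

def shallow_fusion(logp_primary, logp_context, lam=0.3):
    """Combine per-token log-probabilities from two source encodings:
    score(y) = log p(y | source) + lam * log p(y | pivot context)."""
    return {tok: logp_primary[tok] + lam * logp_context[tok]
            for tok in logp_primary}

# Toy next-token distributions (as log-probs) over a tiny Portuguese vocab.
primary = {"gato": math.log(0.5), "cão": math.log(0.3), "casa": math.log(0.2)}
context = {"gato": math.log(0.7), "cão": math.log(0.1), "casa": math.log(0.2)}

fused = shallow_fusion(primary, context)
best = max(fused, key=fused.get)
print(best)  # → gato (highest score under both sources)
```

In a real system this interpolation would run inside beam search at every decoding step; the sketch only shows the scoring rule that lets a high-resource context language re-rank candidates proposed from the primary source.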