🤖 AI Summary
This study addresses the severe performance degradation that extremely low-resource languages suffer in cross-domain neural machine translation due to domain shift. To mitigate this, the authors propose a two-stage approach that combines a fine-tuned neural machine translation (NMT) model with retrieval-augmented generation (RAG) via a large language model (LLM): the NMT model first generates an initial translation, which the LLM then refines using context-aware retrieved examples. The work identifies the quantity of retrieved examples, rather than the choice of retrieval algorithm, as the key driver of performance gains, and shows that the LLM serves as a reliable “safety net” in zero-shot domains. Evaluated on Dhao Bible translations across the Old and New Testaments, the method improves chrF++ from 27.11 to 35.21 (+8.10), approaching the in-domain score of 36.17.
📝 Abstract
Neural Machine Translation (NMT) models for low-resource languages suffer significant performance degradation under domain shift. We quantify this challenge using Dhao, an indigenous language of Eastern Indonesia with no digital footprint beyond the New Testament (NT). When applied to the unseen Old Testament (OT), a standard NMT model fine-tuned on the NT drops from an in-domain score of 36.17 chrF++ to 27.11 chrF++. To recover this loss, we introduce a hybrid framework where a fine-tuned NMT model generates an initial draft, which is then refined by a Large Language Model (LLM) using Retrieval-Augmented Generation (RAG). The final system achieves 35.21 chrF++ (+8.10 recovery), effectively matching the original in-domain quality. Our analysis reveals that this performance is driven primarily by the number of retrieved examples rather than the choice of retrieval algorithm. Qualitative analysis confirms the LLM acts as a robust "safety net," repairing severe failures in zero-shot domains.
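The draft-then-refine pipeline in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`retrieve_examples`, `build_refine_prompt`, `translate`), the toy character n-gram similarity used for retrieval, and the stubbed `nmt_translate`/`llm_complete` callables are all hypothetical.

```python
# Hypothetical sketch of the two-stage NMT + RAG refinement pipeline.
# Stage 1: a fine-tuned NMT model produces a draft translation.
# Stage 2: an LLM refines the draft, conditioned on k retrieved in-domain
# (New Testament) example pairs. All names here are illustrative.

def char_ngrams(text, n=3):
    """Character n-grams: a crude stand-in for a real retrieval index."""
    text = text.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}

def retrieve_examples(source, memory, k=4):
    """Return the k most similar (src, tgt) pairs from the translation
    memory, ranked by n-gram overlap with the out-of-domain source."""
    src_grams = char_ngrams(source)
    scored = sorted(
        memory,
        key=lambda pair: len(src_grams & char_ngrams(pair[0])),
        reverse=True,
    )
    return scored[:k]

def build_refine_prompt(source, draft, examples):
    """Assemble the LLM prompt: retrieved pairs as few-shot context,
    followed by the NMT draft to be repaired."""
    lines = ["Refine the draft translation using the examples.", ""]
    for src, tgt in examples:
        lines += [f"Source: {src}", f"Translation: {tgt}", ""]
    lines += [f"Source: {source}", f"Draft: {draft}", "Refined translation:"]
    return "\n".join(lines)

def translate(source, memory, nmt_translate, llm_complete, k=4):
    """Full pipeline: NMT draft, then LLM refinement with retrieved context."""
    draft = nmt_translate(source)
    examples = retrieve_examples(source, memory, k=k)
    return llm_complete(build_refine_prompt(source, draft, examples))
```

Note that `k` is exposed as a parameter: per the abstract's finding, the number of retrieved examples, not the retrieval algorithm, is the main lever on quality, so in practice tuning `k` matters more than swapping the similarity function.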