Multilingual LLM Prompting Strategies for Medical English-Vietnamese Machine Translation

📅 2025-09-19
🤖 AI Summary
This study addresses the challenges of terminology inconsistency and poor domain adaptation in English–Vietnamese medical machine translation (En-Vi MT), where Vietnamese is a low-resource language. We propose a terminology-aware prompting framework tailored for multilingual large language models (LLMs). Our method integrates Meddict—a specialized medical dictionary—to construct terminology-enhanced prompts and incorporates embedding-based dynamic example retrieval, supporting zero-shot, few-shot, and dictionary-augmented prompting paradigms. Experiments demonstrate substantial improvements in translation accuracy and terminology consistency, outperforming baseline models by +4.2–7.8 BLEU across multiple medical test sets, with particularly strong gains in zero-shot settings. To our knowledge, this is the first systematic validation of the synergistic effectiveness of terminology guidance and embedding-based retrieval for optimizing low-resource medical translation. The framework provides a reusable technical pathway for professional translation in resource-scarce languages.

📝 Abstract
Medical English-Vietnamese machine translation (En-Vi MT) is essential for healthcare access and communication in Vietnam, yet Vietnamese remains a low-resource and under-studied language. We systematically evaluate prompting strategies for six multilingual LLMs (0.5B-9B parameters) on the MedEV dataset, comparing zero-shot, few-shot, and dictionary-augmented prompting with Meddict, an English-Vietnamese medical lexicon. Results show that model scale is the primary driver of performance: larger LLMs achieve strong zero-shot results, while few-shot prompting yields only marginal improvements. In contrast, terminology-aware cues and embedding-based example retrieval consistently improve domain-specific translation. These findings underscore both the promise and the current limitations of multilingual LLMs for medical En-Vi MT.

Problem

Research questions and friction points this paper is trying to address.

Evaluating multilingual LLM prompting for medical English-Vietnamese translation
Addressing low-resource language challenges in healthcare communication
Improving domain-specific translation through terminology-aware strategies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual LLM prompting strategies
Dictionary-augmented terminology-aware cues
Embedding-based example retrieval method
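
The three prompting modes named above (zero-shot, few-shot with retrieved examples, and dictionary-augmented) can be sketched as a single prompt builder. This is a minimal illustration, not the authors' implementation: the glossary entries, the example pool, the prompt wording, and all function names are hypothetical, and the bag-of-words `embed` function is a toy stand-in for a real sentence-embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical glossary entries standing in for a lexicon such as Meddict.
GLOSSARY = {
    "hypertension": "tăng huyết áp",
    "myocardial infarction": "nhồi máu cơ tim",
}

# Hypothetical pool of parallel sentences used as few-shot examples.
EXAMPLE_POOL = [
    ("The patient has a history of hypertension.", "Bệnh nhân có tiền sử tăng huyết áp."),
    ("Blood glucose was measured after fasting.", "Đường huyết được đo sau khi nhịn ăn."),
    ("The tumor was removed surgically.", "Khối u đã được phẫu thuật cắt bỏ."),
]


def embed(text):
    """Toy bag-of-words vector (assumption: a real system would use a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_examples(source, pool, k=1):
    """Return the k pool pairs whose English side is most similar to the source."""
    query = embed(source)
    ranked = sorted(pool, key=lambda pair: cosine(query, embed(pair[0])), reverse=True)
    return ranked[:k]


def find_terms(source, glossary):
    """Return glossary entries whose English term occurs in the source (substring match)."""
    lowered = source.lower()
    return [(en, vi) for en, vi in glossary.items() if en in lowered]


def build_prompt(source, glossary=GLOSSARY, pool=EXAMPLE_POOL, k=1):
    """Assemble a prompt; k=0 with an empty glossary gives a plain zero-shot prompt."""
    lines = ["Translate the following medical text from English to Vietnamese."]
    terms = find_terms(source, glossary)
    if terms:
        lines.append("Use these term translations:")
        lines += [f"- {en} -> {vi}" for en, vi in terms]
    for en, vi in retrieve_examples(source, pool, k):
        lines += [f"English: {en}", f"Vietnamese: {vi}"]
    lines += [f"English: {source}", "Vietnamese:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Hypertension increases the risk of myocardial infarction."))
```

Calling `build_prompt(source, glossary={}, k=0)` yields the zero-shot variant, while raising `k` adds more retrieved examples. Substring matching is the simplest possible term lookup; a real pipeline would likely need tokenization and handling of inflected or multi-word terms.
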
Nhu Vo
College of Engineering and Computer Science, VinUniversity, Vietnam
Nu-Uyen-Phuong Le
University of Queensland, Australia
Dung D. Le
College of Engineering and Computer Science, VinUniversity, Vietnam
Massimo Piccardi
Professor, University of Technology Sydney
natural language processing, computer vision, pattern recognition
Wray Buntine
Professor, VinUniversity
Machine Learning