AI Summary
This study addresses the marginalization of African languages in scientific communication: the lack of established scientific terminology limits how hundreds of millions of speakers access and produce scientific knowledge. To bridge this gap, the authors present AfriScience-MT, a parallel corpus spanning six African languages and eleven scientific domains, built by professional translators working with expert science communicators, who translated plain-language summaries of scientific papers and coined terms where none existed. The resource supports machine translation evaluation in zero-shot, few-shot, and fine-tuned settings. Closed-source models perform best: GPT-5.4 and Gemini-3.1-Flash-Lite reach average sentence-level COMET scores of 68.3 and 68.0, respectively. Among open systems, fine-tuned NLLB-1.3B reaches 67.3 at the sentence level, and TranslateGemma-12B reaches 44.0 at the document level with 1-shot in-context learning. This work advances the decolonization and localization of scientific knowledge and fills a gap in African-language scientific translation resources.
Abstract
The dominance of colonial languages in African education and scientific communication limits how hundreds of millions of speakers of African languages access and produce scientific knowledge. A core obstacle is the lack of established scientific terminology in these languages. We introduce AfriScience-MT, a parallel corpus covering six African languages (Amharic, Hausa, Luganda, Northern Sotho, Yorùbá, and isiZulu) across 11 scientific domains. Professional translators, working with expert science communicators, translated plain-language summaries of scientific papers into each target language and created new terms where none existed. We benchmark machine translation systems and large language models in zero-shot, few-shot, and fine-tuned settings. Our results show that closed-source models outperform all open-source models at both the sentence and document levels: GPT-5.4 and Gemini-3.1-Flash-Lite lead with average sentence-level COMET scores of 68.3 and 68.0, respectively, and tie at an average document-level COMET of 48.3. Among open systems, fine-tuned NLLB-1.3B reaches 67.3 at the sentence level, and TranslateGemma-12B reaches 44.0 at the document level with 1-shot in-context learning. We release AfriScience-MT to support benchmarking and document-level scientific MT for African languages.