Scaling Model and Data for Multilingual Machine Translation with Open Large Language Models

📅 2026-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work aims to enhance the performance of open-source large language models in multilingual machine translation, particularly for low-resource languages. Building on the Gemma3 architecture, the study systematically investigates the co-scaling dynamics between model size and data volume through continued pretraining and instruction tuning, resulting in MiLMMT-46, a model supporting 46 languages. This research presents the first empirical demonstration of joint model-and-data scaling effects in open-source large language models for multilingual translation. MiLMMT-46 substantially outperforms existing open-source systems such as Seed-X, HY-MT-1.5, and TranslateGemma across all 46 languages, achieving performance comparable to commercial systems such as Google Translate and Gemini 3 Pro.

📝 Abstract
Open large language models (LLMs) have demonstrated increasingly strong multilingual capabilities in recent years. In this paper, we present a study of open LLMs for multilingual machine translation (MT) across a range of languages, and investigate the effects of model scaling and data scaling when adapting open LLMs to multilingual MT through continual pretraining and instruction finetuning. Based on the Gemma3 model family, we develop MiLMMT-46, which achieves top-tier multilingual translation performance across 46 languages. Extensive experiments show that MiLMMT-46 consistently outperforms recent state-of-the-art (SOTA) models, including Seed-X, HY-MT-1.5, and TranslateGemma, and achieves competitive performance with strong proprietary systems such as Google Translate and Gemini 3 Pro.
Problem

Research questions and friction points this paper is trying to address.

multilingual machine translation
open large language models
model scaling
data scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

model scaling
data scaling
multilingual machine translation
open large language models
instruction finetuning