🤖 AI Summary
This study investigates the motivations for developing localized large language models (LLMs), the origins of language-specific capabilities—particularly Japanese—and the principles governing cross-lingual capability transfer. Method: We conduct a systematic evaluation of 35 Japanese, English, and multilingual LLMs across 19 Japanese and English benchmarks, employing multi-benchmark assessment, correlation analysis, and principal component analysis (PCA). Contribution/Results: We formally define and empirically validate the concept of “Japanese-language capability” for the first time. We find that Japanese knowledge QA and EN↔JA translation strongly depend on Japanese textual training and follow Japanese-specific compute-scaling laws; in contrast, coding and arithmetic abilities exhibit robust cross-lingual transfer. We identify two transferable capability clusters and two Japanese-specific capability clusters. Furthermore, English pretraining significantly enhances performance on Japanese academic reasoning (JMMLU). These findings provide both theoretical foundations and practical guidelines for multilingual LLM localization.
📝 Abstract
Why do we build local large language models (LLMs)? What should a local LLM learn from the target language? Which abilities can be transferred from other languages? Do language-specific scaling laws exist? To explore these research questions, we evaluated 35 Japanese, English, and multilingual LLMs on 19 evaluation benchmarks for Japanese and English, taking Japanese as a local language. Adopting an observational approach, we analyzed correlations among benchmark scores and conducted principal component analysis (PCA) on the scores to derive *ability factors* of local LLMs. We found that training on English text can improve the scores of academic subjects in Japanese (JMMLU). In addition, it is unnecessary to specifically train on Japanese text to enhance abilities for solving Japanese code generation, arithmetic reasoning, commonsense, and reading comprehension tasks. In contrast, training on Japanese text could improve question-answering tasks about Japanese knowledge and English–Japanese translation, which indicates that the abilities for solving these two tasks can be regarded as *Japanese abilities* for LLMs. Furthermore, we confirmed that the Japanese abilities scale with the computational budget for Japanese text.
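The analysis pipeline the abstract describes — correlating benchmark scores across models, then running PCA to extract ability factors — can be sketched roughly as follows. This is a minimal illustration, not the authors' actual code: the score matrix here is synthetic, and the shapes (35 models × a handful of benchmarks) merely mirror the setup described above.

```python
# Hedged sketch of the paper's observational analysis: correlation analysis
# and PCA over a models-by-benchmarks score matrix. Scores are synthetic
# random data standing in for real benchmark results.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic scores: 35 models (rows) x 6 benchmarks (columns).
n_models, n_benchmarks = 35, 6
scores = rng.uniform(0.0, 1.0, size=(n_models, n_benchmarks))

# Step 1: pairwise correlation between benchmark score columns
# (high correlation suggests two benchmarks probe a shared ability).
corr = np.corrcoef(scores, rowvar=False)  # shape (6, 6)

# Step 2: PCA via SVD on the standardized score matrix. Each principal
# component is a candidate "ability factor"; its loadings show which
# benchmarks the factor draws on.
standardized = (scores - scores.mean(axis=0)) / scores.std(axis=0)
U, S, Vt = np.linalg.svd(standardized, full_matrices=False)
explained_variance_ratio = (S ** 2) / (S ** 2).sum()
loadings = Vt  # row i: component i's weights over the benchmarks
```

Inspecting `loadings` would reveal which benchmarks cluster together on each component — the kind of evidence behind the transferable versus Japanese-specific capability clusters reported above.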