Expanding Foundational Language Capabilities in Open-Source LLMs through a Korean Case Study

📅 2025-09-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing open-source large language models (LLMs) exhibit suboptimal Korean language capabilities; the challenge is to strengthen Korean without compromising English performance. Method: We propose Llama-3-Motif, a 102B-parameter model built on the Llama 3 architecture, which combines LlamaPro's structured block expansion with Masked Structure Growth (MSG) to enable efficient, scalable parameter growth without altering the core Transformer architecture. Training was conducted on the MoAI platform across hyperscale GPU clusters, with fine-grained tuning of the bilingual data ratio to balance English and Korean supervision. Contribution/Results: Llama-3-Motif achieves state-of-the-art Korean performance on major Korean benchmarks, surpassing prior open-source models and approaching GPT-4, while retaining strong English capabilities. This work is the first demonstration of high-fidelity, balanced bilingual (English-Korean) enhancement in a 100B-scale open-source LLM, establishing a paradigm for lightweight, efficient multilingual model scaling.
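The summary describes LlamaPro-style block expansion made function-preserving in the spirit of Masked Structure Growth: new decoder blocks are inserted as copies of existing ones with zero-initialized output projections, so the expanded model initially computes the same function as the base model. The sketch below illustrates that idea only; the base checkpoint name, the insertion interval, and the freeze-originals policy are assumptions, not details taken from the paper, and layer_idx bookkeeping for KV caching is omitted.

```python
# Minimal sketch of function-preserving depth expansion (not the authors' code).
import copy

import torch
from torch import nn
from transformers import LlamaForCausalLM


def expand_depth(model: LlamaForCausalLM, insert_every: int = 4,
                 freeze_original: bool = True) -> LlamaForCausalLM:
    """Insert a zero-initialized copy after every `insert_every`-th decoder layer."""
    old_layers = model.model.layers
    if freeze_original:
        # Assumed recipe: continued pretraining updates only the inserted blocks.
        for p in old_layers.parameters():
            p.requires_grad = False

    new_layers = nn.ModuleList()
    for i, layer in enumerate(old_layers):
        new_layers.append(layer)
        if (i + 1) % insert_every == 0:
            block = copy.deepcopy(layer)
            # Zeroing o_proj and down_proj silences the attention and MLP branches,
            # so the new block is an identity map through its residual connections
            # at initialization (the function-preserving growth the summary cites).
            nn.init.zeros_(block.self_attn.o_proj.weight)
            nn.init.zeros_(block.mlp.down_proj.weight)
            for p in block.parameters():
                p.requires_grad = True  # only the new blocks are trained
            new_layers.append(block)

    model.model.layers = new_layers
    model.config.num_hidden_layers = len(new_layers)
    return model


if __name__ == "__main__":
    # Assumed base checkpoint and expansion interval, for illustration only.
    base = LlamaForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-70B",
                                            torch_dtype=torch.bfloat16)
    expanded = expand_depth(base, insert_every=4)
```

Because every inserted block starts as an identity mapping, the expanded model's outputs match the base model before training, which is what makes this kind of growth stable to continue pretraining at scale.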

📝 Abstract
We introduce Llama-3-Motif, a language model consisting of 102 billion parameters, specifically designed to enhance Korean capabilities while retaining strong performance in English. Developed on the Llama 3 architecture, Llama-3-Motif employs advanced training techniques, including LlamaPro and Masked Structure Growth, to effectively scale the model without altering its core Transformer architecture. Using the MoAI platform for efficient training across hyperscale GPU clusters, we optimized Llama-3-Motif on a carefully curated dataset that maintains a balanced ratio of Korean and English data. Llama-3-Motif shows strong performance on Korean-specific benchmarks, outperforming existing models and achieving results comparable to GPT-4.
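The balanced Korean/English data ratio the abstract mentions amounts to upsampling the under-represented language during pretraining. The sketch below shows one common way to compute such mixture weights; the corpus sizes and the 50/50 target are illustrative assumptions, not the ratio the authors actually used.

```python
# Illustrative language-mixture weighting (assumed numbers, not from the paper).
import random


def mixture_weights(corpus_tokens: dict[str, int],
                    target_share: dict[str, float]) -> dict[str, float]:
    """Per-example sampling weight that upsamples the under-represented language."""
    total = sum(corpus_tokens.values())
    return {lang: target_share[lang] / (corpus_tokens[lang] / total)
            for lang in corpus_tokens}


def sample_language(weights: dict[str, float],
                    corpus_tokens: dict[str, int]) -> str:
    """Draw the language of the next training example under the tuned mixture."""
    langs = list(weights)
    # Effective probability = natural share of the corpus * upsampling weight.
    probs = [weights[lang] * corpus_tokens[lang] for lang in langs]
    return random.choices(langs, weights=probs, k=1)[0]


if __name__ == "__main__":
    tokens = {"en": 900_000_000_000, "ko": 100_000_000_000}  # hypothetical sizes
    target = {"en": 0.5, "ko": 0.5}                          # hypothetical target
    w = mixture_weights(tokens, target)
    print(w)  # Korean examples are upsampled roughly 9x their natural share here
    print(sample_language(w, tokens))
```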
Problem

Research questions and friction points this paper is trying to address.

Enhancing Korean language capabilities in open-source LLMs
Scaling model size without altering Transformer architecture
Balancing Korean and English performance in training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhanced Korean-English bilingual model with 102B parameters
Advanced training techniques: LlamaPro and Masked Structure Growth
Optimized on the MoAI platform with a balanced Korean-English dataset
👥 Authors
Junghwan Lim
Gangwon Jo
Sungmin Lee (AIX, SK Telecom)
Jiyoung Park
Dongseok Kim
Jihwan Kim
Junhyeok Lee (Johns Hopkins University, Center for Language and Signal Processing)
Wai Ting Cheung
Dahye Choi
Kibong Choi
Jaeyeon Huh
Beomgyu Kim
Jangwoong Kim
Taehyun Kim
Haesol Lee
Jeesoo Lee
Dongpin Oh
Changseok Song
Daewon Suh