🤖 AI Summary
Problem: Existing open-source large language models (LLMs) exhibit suboptimal Korean language capabilities; the challenge is to strengthen Korean performance without compromising English performance.
Method: We propose Llama-3-Motif—a 102B-parameter model built upon the Llama 3 architecture—that integrates LlamaPro’s structured expansion with Masked Structure Growth (MSG), enabling efficient, scalable parameter growth without altering the core Transformer architecture. Training was conducted on the MoAI ultra-large-scale GPU cluster, with fine-grained bilingual data ratio tuning to balance English and Korean supervision.
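The core idea of LlamaPro-style structured expansion is to interleave newly added Transformer blocks whose output projections are zero-initialized, so each new block is an identity map at insertion time and the grown model initially computes exactly the same function as the original. The toy sketch below illustrates that function-preserving property on miniature residual blocks; it is an assumption-laden illustration (the `Block` class, `expand` helper, and expansion interval are invented for this example), not the authors' implementation.

```python
import numpy as np

# Hypothetical minimal sketch of LlamaPro-style depth expansion;
# not the actual Llama-3-Motif training code.

class Block:
    """Toy residual block: x + W_out @ relu(W_in @ x)."""
    def __init__(self, d, zero_init=False, rng=None):
        rng = rng or np.random.default_rng(0)
        self.w_in = rng.standard_normal((d, d)) * 0.1
        # Zero-initializing the output projection makes the whole
        # block an identity map at insertion time (function-preserving).
        self.w_out = (np.zeros((d, d)) if zero_init
                      else rng.standard_normal((d, d)) * 0.1)
        self.frozen = False

    def forward(self, x):
        return x + self.w_out @ np.maximum(self.w_in @ x, 0.0)

def expand(blocks, every=2):
    """Interleave one zero-initialized (identity) block after every
    `every` original blocks; mark originals frozen (in real training,
    their gradients would be disabled)."""
    d = blocks[0].w_in.shape[0]
    grown = []
    for i, b in enumerate(blocks, 1):
        b.frozen = True
        grown.append(b)
        if i % every == 0:
            grown.append(Block(d, zero_init=True))
    return grown

def run(blocks, x):
    for b in blocks:
        x = b.forward(x)
    return x

rng = np.random.default_rng(42)
d = 4
model = [Block(d, rng=rng) for _ in range(4)]
x = rng.standard_normal(d)

y_before = run(model, x)
expanded = expand(model)            # 4 blocks -> 6 blocks
y_after = run(expanded, x)
assert np.allclose(y_before, y_after)  # expansion preserves the function
```

Only the new blocks would then receive gradient updates on the bilingual corpus, which is what makes this style of parameter growth efficient: the original capabilities (here, English) are protected by construction while capacity for the new language is added.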
Contribution/Results: Llama-3-Motif achieves state-of-the-art Korean performance on major Korean benchmarks—surpassing all prior open-source models and approaching GPT-4—while retaining strong English capabilities. This work represents the first demonstration of high-fidelity, balanced bilingual (English–Korean) enhancement in a 100B-scale open-source LLM, establishing a practical paradigm for lightweight, efficient multilingual model scaling.
📝 Abstract
We introduce Llama-3-Motif, a language model consisting of 102 billion parameters, specifically designed to enhance Korean capabilities while retaining strong performance in English. Developed on the Llama 3 architecture, Llama-3-Motif employs advanced training techniques, including LlamaPro and Masked Structure Growth, to effectively scale the model without altering its core Transformer architecture. Using the MoAI platform for efficient training across hyperscale GPU clusters, we optimized Llama-3-Motif on a carefully curated dataset that maintains a balanced ratio of Korean and English data. Llama-3-Motif demonstrates strong performance on Korean-specific benchmarks, outperforming existing open-source models and achieving results comparable to GPT-4.