🤖 AI Summary
This work proposes K-EXAONE, a 236B-parameter multilingual large language model built on a sparsely activated Mixture-of-Experts architecture that activates only 23B parameters during inference. Designed to meet growing demand in industrial and scientific applications for strong reasoning, long-context handling, and multilingual support, K-EXAONE handles context lengths of up to 256K tokens and covers six languages: Korean, English, Spanish, German, Japanese, and Vietnamese. Its development combines large-scale distributed training, multilingual pretraining, and alignment techniques to scale efficiently. Evaluations across reasoning, agentic, general, and multilingual benchmarks show that K-EXAONE matches the performance of comparable open-weight models, underscoring its potential as a high-performance foundation model.
📝 Abstract
This technical report presents K-EXAONE, a large-scale multilingual language model developed by LG AI Research. K-EXAONE is built on a Mixture-of-Experts architecture with 236B total parameters, of which 23B are activated during inference. It supports a 256K-token context window and covers six languages: Korean, English, Spanish, German, Japanese, and Vietnamese. We evaluate K-EXAONE on a comprehensive benchmark suite spanning reasoning, agentic, general, Korean, and multilingual capabilities, where it performs comparably to open-weight models of similar size. Designed to advance AI for a better life, K-EXAONE is positioned as a powerful proprietary foundation model for a wide range of industrial and research applications.
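As context for the efficiency figures above, the sketch below illustrates generic sparse top-k Mixture-of-Experts routing, the mechanism by which a model can hold 236B total parameters while activating only a fraction of them for each token. This is a minimal illustration under assumed settings, not K-EXAONE's actual implementation; the class name, layer sizes, expert count, and `k` are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoELayer(nn.Module):
    """Toy sparsely activated MoE feed-forward layer (illustrative sizes only)."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=16, k=2):
        super().__init__()
        self.k = k
        # Router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = self.router(x)                           # (num_tokens, num_experts)
        top_scores, top_idx = scores.topk(self.k, dim=-1)  # keep k experts per token
        weights = F.softmax(top_scores, dim=-1)           # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in top_idx[:, slot].unique().tolist():
                mask = top_idx[:, slot] == e              # tokens routed to expert e in this slot
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out


layer = TopKMoELayer()
tokens = torch.randn(8, 512)
print(layer(tokens).shape)  # torch.Size([8, 512])
```

Because only the k selected expert blocks run for each token, per-token compute scales with the active parameters (roughly k/num_experts of the expert weights plus the shared layers) rather than the full parameter count, which is how a 236B-parameter model can serve inference at 23B-active cost.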