🤖 AI Summary
This work proposes K-EXAONE, a 236B-parameter multilingual large language model built on a sparsely activated Mixture-of-Experts architecture that activates only 23B parameters during inference. Designed to meet growing demand in industrial and scientific applications for strong reasoning, long-context handling, and multilingual support, K-EXAONE handles context lengths of up to 256K tokens and covers six languages: Korean, English, Spanish, German, Japanese, and Vietnamese. Its development combines large-scale distributed training, multilingual pretraining, and alignment techniques to scale efficiently. Evaluations across reasoning, agentic, general, and multilingual benchmarks show that K-EXAONE matches the performance of comparable open-weight models, underscoring its potential as a high-performance foundation model.
📝 Abstract
This technical report presents K-EXAONE, a large-scale multilingual language model developed by LG AI Research. K-EXAONE is built on a Mixture-of-Experts architecture with 236B total parameters, of which 23B are activated during inference. It supports a 256K-token context window and covers six languages: Korean, English, Spanish, German, Japanese, and Vietnamese. We evaluate K-EXAONE on a comprehensive benchmark suite spanning reasoning, agentic, general, Korean, and multilingual capabilities, where it performs comparably to open-weight models of similar size. Designed to advance AI for a better life, K-EXAONE is positioned as a powerful proprietary foundation model for a wide range of industrial and research applications.
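As context for the efficiency figures above, the sketch below illustrates generic sparse top-k Mixture-of-Experts routing, the mechanism by which a model can hold 236B total parameters while activating only a fraction of them for each token. This is a minimal illustration under assumed settings, not K-EXAONE's actual implementation; the class name, layer sizes, expert count, and `k` are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoELayer(nn.Module):
    """Toy sparsely activated MoE feed-forward layer (illustrative sizes only)."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=16, k=2):
        super().__init__()
        self.k = k
        # Router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = self.router(x)                           # (num_tokens, num_experts)
        top_scores, top_idx = scores.topk(self.k, dim=-1)  # keep k experts per token
        weights = F.softmax(top_scores, dim=-1)           # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in top_idx[:, slot].unique().tolist():
                mask = top_idx[:, slot] == e              # tokens routed to expert e in this slot
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out


layer = TopKMoELayer()
tokens = torch.randn(8, 512)
print(layer(tokens).shape)  # torch.Size([8, 512])
```

Because only the k selected expert blocks run for each token, per-token compute scales with the active parameters (roughly k/num_experts of the expert weights plus the shared layers) rather than the full parameter count, which is how a 236B-parameter model can serve inference at 23B-active cost.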