🤖 AI Summary
This work proposes A.X K1, a 519-billion-parameter mixture-of-experts (MoE) language model trained from scratch under a constrained computational budget, designed to strengthen multilingual reasoning, Korean in particular, while preserving inference efficiency. Leveraging a corpus of roughly 10 trillion tokens, multi-stage data curation, scaling-law-informed training configurations, and a Think-Fusion training strategy, the model lets users explicitly switch its reasoning mode on or off. Evaluations show that A.X K1 performs competitively with leading open-source models across multiple benchmarks and holds a distinctive advantage on Korean-language tasks, all while maintaining inference efficiency and deployment flexibility.
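The phrase "scaling-law-informed training configurations" is not unpacked here; for orientation, the Chinchilla-style formulation such configurations are typically fit against is sketched below. The exponents shown are the commonly cited approximate values from the scaling-laws literature, not figures reported for A.X K1.

$$
C \approx 6\,N_{\text{act}}\,D, \qquad N^{*}(C) \propto C^{a}, \qquad D^{*}(C) \propto C^{b}, \qquad a \approx b \approx 0.5,
$$

where $C$ is the training compute budget in FLOPs, $D$ the number of training tokens, and $N_{\text{act}}$ the active parameter count per token, which for an MoE model is far smaller than the 519B total.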
📝 Abstract
We introduce A.X K1, a 519B-parameter Mixture-of-Experts (MoE) language model trained from scratch. Our design leverages scaling laws to optimize training configurations and vocabulary size under fixed computational budgets. A.X K1 is pre-trained on a corpus of approximately 10T tokens, curated by a multi-stage data processing pipeline. Designed to bridge the gap between reasoning capability and inference efficiency, A.X K1 supports explicitly controllable reasoning to facilitate scalable deployment across diverse real-world scenarios. We propose a simple yet effective Think-Fusion training recipe, enabling user-controlled switching between thinking and non-thinking modes within a single unified model. Extensive evaluations demonstrate that A.X K1 achieves performance competitive with leading open-source models, while establishing a distinctive advantage in Korean-language benchmarks.
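The abstract does not spell out how the user-controlled mode switch is surfaced at inference time. As a minimal sketch, the snippet below shows one common way a single unified model exposes such a toggle through its prompt format; the special tokens (`<|user|>`, `<think>`) and the empty-reasoning-block convention are assumptions borrowed from other open-source hybrid-reasoning models, not A.X K1's documented template.

```python
# Minimal sketch of user-controlled thinking/non-thinking switching, in the
# spirit of the Think-Fusion recipe. All special tokens here are hypothetical.

def build_prompt(user_msg: str, enable_thinking: bool) -> str:
    """Render a single-turn chat prompt with an explicit reasoning toggle."""
    prompt = f"<|user|>\n{user_msg}\n<|assistant|>\n"
    if enable_thinking:
        # Thinking mode: open a reasoning block for the model to fill in
        # before it emits the final answer.
        prompt += "<think>\n"
    else:
        # Non-thinking mode: pre-close an empty reasoning block so the model
        # skips deliberation and answers directly.
        prompt += "<think>\n\n</think>\n"
    return prompt

if __name__ == "__main__":
    print(build_prompt("Prove that sqrt(2) is irrational.", enable_thinking=True))
    print(build_prompt("What is the capital of Korea?", enable_thinking=False))
```

Because both behaviors live in one set of weights, a serving system can flip the toggle per request rather than routing traffic to separate reasoning and chat models, which is what makes the single unified model attractive for scalable deployment.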