Qwen3 Technical Report

📅 2025-05-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the trade-offs among performance, efficiency, multilingual capability, and dynamic reasoning in large language models, this work introduces the Qwen3 series (0.6B to 235B parameters), built around a unified thinking/non-thinking dual-mode design with a user-controllable thinking budget, so the depth of reasoning can be adapted at inference time. The series combines Mixture-of-Experts (MoE) and dense architectures, and pairs knowledge distillation from the flagship models with multilingual pretraining that expands language coverage from 29 to 119 languages and dialects, while substantially reducing the training cost of the smaller variants. Qwen3 achieves state-of-the-art results on code, mathematical reasoning, and agent-oriented benchmarks, remaining competitive with significantly larger MoE models and with leading proprietary models. All model weights are released under the Apache 2.0 license to support openness and reproducibility.
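As a concrete illustration of the dual-mode design described above, the sketch below toggles thinking and non-thinking behaviour through the chat template. It assumes the Hugging Face transformers interface and the `enable_thinking` flag documented on the public Qwen3 model cards; the checkpoint name and prompt are placeholders, not details taken from this page.

```python
# Minimal sketch: switching Qwen3 between thinking and non-thinking modes.
# The `enable_thinking` flag is assumed from the public Qwen3 model cards;
# checkpoint name and prompt are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many prime numbers are there below 50?"}]

def run(enable_thinking: bool) -> str:
    # The chat template decides whether a <think>...</think> block is produced.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True, enable_thinking=enable_thinking
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=1024)
    return tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)

print(run(enable_thinking=True))   # multi-step reasoning before the answer
print(run(enable_thinking=False))  # fast, direct response
```

The model cards additionally describe soft switches (/think and /no_think) that can be placed directly in a user turn to override the template default per query; treat that detail as an assumption from the cards rather than something stated on this page.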

📝 Abstract
In this work, we present Qwen3, the latest version of the Qwen model family. Qwen3 comprises a series of large language models (LLMs) designed to advance performance, efficiency, and multilingual capabilities. The Qwen3 series includes models of both dense and Mixture-of-Experts (MoE) architectures, with parameter scales ranging from 0.6 to 235 billion. A key innovation in Qwen3 is the integration of thinking mode (for complex, multi-step reasoning) and non-thinking mode (for rapid, context-driven responses) into a unified framework. This eliminates the need to switch between different models, such as chat-optimized models (e.g., GPT-4o) and dedicated reasoning models (e.g., QwQ-32B), and enables dynamic mode switching based on user queries or chat templates. Meanwhile, Qwen3 introduces a thinking budget mechanism, allowing users to allocate computational resources adaptively during inference, thereby balancing latency and performance based on task complexity. Moreover, by leveraging the knowledge from the flagship models, we significantly reduce the computational resources required to build smaller-scale models, while ensuring their highly competitive performance. Empirical evaluations demonstrate that Qwen3 achieves state-of-the-art results across diverse benchmarks, including code generation, mathematical reasoning, and agent tasks, and remains competitive against larger MoE models and proprietary models. Compared to its predecessor Qwen2.5, Qwen3 expands multilingual support from 29 to 119 languages and dialects, enhancing global accessibility through improved cross-lingual understanding and generation capabilities. To facilitate reproducibility and community-driven research and development, all Qwen3 models are publicly accessible under Apache 2.0.
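The thinking budget is described above only at a high level. The sketch below shows one plausible way such a budget could be enforced at inference time: generate reasoning tokens up to the budget, then close the reasoning block and request the final answer. The </think> delimiter, the early-stop sentence, and the two-phase generation loop are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch of a "thinking budget": cap the reasoning segment at a
# user-chosen token count, then force the model to answer. The </think>
# delimiter and the early-stop sentence are assumptions, not the paper's
# exact strings.
def generate_with_thinking_budget(model, tokenizer, prompt: str,
                                  thinking_budget: int = 256,
                                  answer_budget: int = 512) -> str:
    think_end_id = tokenizer.convert_tokens_to_ids("</think>")  # assumed special token
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Phase 1: reason until </think> appears or the budget runs out.
    thought = model.generate(
        **inputs, max_new_tokens=thinking_budget, eos_token_id=think_end_id
    )
    text = tokenizer.decode(thought[0], skip_special_tokens=False)

    # Phase 2: if the budget was exhausted first, close the reasoning block
    # ourselves and let the model produce the final answer directly.
    if "</think>" not in text:
        text += "\nThe thinking budget is exhausted; answering now.\n</think>\n\n"
    cont = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
    out = model.generate(**cont, max_new_tokens=answer_budget)
    return tokenizer.decode(out[0][cont.input_ids.shape[-1]:], skip_special_tokens=True)
```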
Problem

Research questions and friction points this paper is trying to address.

Advancing performance, efficiency, and multilingual capabilities in LLMs
Integrating thinking and non-thinking modes for dynamic reasoning
Reducing computational resources while maintaining competitive performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified thinking and non-thinking modes framework
Thinking budget mechanism for adaptive computation
Knowledge transfer from flagship models to optimize smaller models (see the distillation sketch after this list)
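The knowledge-transfer item above refers to distilling the behaviour of the flagship models into the smaller checkpoints. As a generic illustration of that family of techniques, the sketch below implements a standard logit-level distillation loss; the temperature and the data strategy are assumptions, and the paper's exact recipe is not given on this page.

```python
# Generic logit-level knowledge distillation loss (a sketch of the usual
# technique; the paper's exact recipe, temperature, and data mix are not
# specified on this page).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) over next-token distributions.

    Both tensors have shape (batch, seq_len, vocab_size).
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean reduction, rescaled by T^2 so gradient magnitudes stay
    # comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```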
👥 Authors
An Yang (Qwen Team, Peking University): Natural Language Processing (NLP)
Anfeng Li
Baosong Yang (Alibaba Inc.): Machine Learning, Large Language Model, Machine Translation
Beichen Zhang
Binyuan Hui (Qwen Team, Alibaba Group): Large Language Models, CodeLLMs, Reasoning, Agent
Bo Zheng
Bowen Yu (Qwen Team, Alibaba Group): Post-training, Foundation Model
Chang Gao
Chengen Huang
Chenxu Lv
Chujie Zheng (Qwen Team, Alibaba Group): Artificial Intelligence, Large Language Models
Dayiheng Liu
Fan Zhou
Fei Huang
Feng Hu
Hao Ge
Haoran Wei
Huan Lin
Jialong Tang (Qwen Team, Alibaba): LLM, NLP
Jian Yang
Jianhong Tu
Jianwei Zhang
Jianxin Yang
Jiaxi Yang (PhD student, SIAT, CAS, China): Natural Language Processing, Large Language Model
Jing Zhou
Jingren Zhou (Alibaba Group, Microsoft): Cloud Computing, Large Scale Distributed Systems, Machine Learning, Query Processing, Query
Junyang Lin (Qwen Team, Alibaba Group & Peking University): Natural Language Processing, Cross-Modal Representation Learning, Pretraining
Kai Dang
Keqin Bao (University of Science and Technology of China): Large Language Models, Recommender Systems
Kexin Yang
Le Yu
Lianghao Deng
Mei Li
Min Xue
Mingze Li
Pei Zhang
Peng Wang
Qin Zhu
Rui Men (Qwen Team, Alibaba Group & Peking University): NLP
Ruize Gao
Shixuan Liu (National University of Defense Technology): Knowledge Reasoning, Domain Generalization, Causal Inference, Data Engineering
Shuang Luo
Tianhao Li
Tianyi Tang (Qwen Team, Alibaba Group & Renmin University of China): Artificial Intelligence, Natural Language Processing
Wenbiao Yin (Tongyi Lab, Alibaba Group): LLM, Agent, RAG
Xingzhang Ren
Xinyu Wang
Xinyu Zhang
Xuancheng Ren
Yang Fan (University of Science and Technology of China): Learning to Teach, Automated Machine Learning, Neural Architecture Search, Natural Language Processing, AI for Medicine
Yang Su (King's College London)
Yichang Zhang (Qwen Team, Alibaba Group): NLP, Reinforcement Learning, Deep Learning, Machine Learning, Artificial Intelligence
Yinger Zhang
Yu Wan
Yuqiong Liu
Zekun Wang
Zeyu Cui (Institute of Automation, Chinese Academy of Sciences): Code Generation, LLM, Recommendation System
Zhenru Zhang (Qwen Team, Alibaba Group): Large Language Model
Zhipeng Zhou
Zihan Qiu (Qwen Team, Alibaba Group & IIIS, Tsinghua University): Mixture of Experts, Modular Deep Learning, Interpretability