🤖 AI Summary
This work addresses the challenge of substantially improving code generation models on agent-oriented tasks at a very small active parameter count of only 3 billion. The authors propose an agent training paradigm grounded in environment feedback, which leverages large-scale synthesis of verifiable programming tasks paired with executable environments. By combining mid-training on this feedback with reinforcement learning, they efficiently train an 80-billion-parameter sparse model. The resulting model achieves performance on agent-centric benchmarks such as SWE-Bench and Terminal-Bench that rivals substantially larger models. To advance research and applications in coding agents, the authors release both base and instruction-tuned versions of the model under an open-source license.
📝 Abstract
We present Qwen3-Coder-Next, an open-weight language model specialized for coding agents. Qwen3-Coder-Next is an 80-billion-parameter model that activates only 3 billion parameters during inference, enabling strong coding capability with efficient inference. In this work, we explore how far strong training recipes can push the capability limits of models with small active parameter footprints. To achieve this, we perform agentic training through large-scale synthesis of verifiable coding tasks paired with executable environments, allowing the model to learn directly from environment feedback during mid-training and reinforcement learning. Across agent-centric benchmarks including SWE-Bench and Terminal-Bench, Qwen3-Coder-Next achieves performance competitive with substantially larger models despite its small active parameter count. We release both base and instruction-tuned open-weight versions to support research and real-world coding agent development.
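The 80B-total / 3B-active split described above is characteristic of sparse mixture-of-experts architectures, where a router activates only a few experts per token. The toy sketch below illustrates that idea only; the expert counts and dimensions are hypothetical and are not the real Qwen3-Coder-Next configuration, which the abstract does not specify.

```python
import numpy as np

# Toy top-k expert routing (assumption: a MoE-style design; all sizes are illustrative).
rng = np.random.default_rng(0)

n_experts = 64          # total experts in a layer (hypothetical)
top_k = 4               # experts activated per token (hypothetical)
d_model, d_ff = 8, 32   # tiny dimensions for the sketch

# Each expert is a small feed-forward block: W_in (d_model x d_ff), W_out (d_ff x d_model).
experts_in = rng.standard_normal((n_experts, d_model, d_ff))
experts_out = rng.standard_normal((n_experts, d_ff, d_model))
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route a single token vector x through its top-k experts only."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]      # indices of the top-k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                  # softmax over the chosen experts
    out = np.zeros_like(x)
    for w, e in zip(weights, chosen):
        out += w * np.maximum(x @ experts_in[e], 0) @ experts_out[e]
    return out, chosen

x = rng.standard_normal(d_model)
y, used = moe_forward(x)

# Only top_k of n_experts participate, so the active fraction per token is top_k / n_experts.
per_expert = d_model * d_ff + d_ff * d_model
total_params = n_experts * per_expert
active_params = top_k * per_expert
print(f"active fraction per token: {active_params / total_params:.3f}")
```

In the same spirit, 3B active out of 80B total corresponds to roughly 1/27 of the parameters participating per token, which is what makes inference cheap relative to the model's full capacity.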