Klear-AgentForge: Forging Agentic Intelligence through Posttraining Scaling

📅 2025-11-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Open-source communities have long lacked a complete, high-performance post-training framework for agent models. This work introduces a fully open-source agent training pipeline, built upon the Qwen3-8B foundation model and supporting tool invocation and environment interaction. Methodologically, it establishes an end-to-end training workflow, from supervised fine-tuning (SFT) on synthetic data to multi-turn reinforcement learning (RL), systematically addressing core challenges in agent behavior modeling, multi-step reasoning, and feedback alignment. Experimental results demonstrate that the resulting agent significantly outperforms similarly sized open-source agents on benchmarks including ToolBench (tool usage) and CodeAgent (code generation), achieving performance competitive with larger proprietary agents. This work fills a critical gap in open-source agent training methodology, providing a standardized, reproducible, and scalable infrastructure for future agent research and development.

📝 Abstract
Despite the proliferation of powerful agentic models, the lack of critical post-training details hinders the development of strong counterparts in the open-source community. In this study, we present a comprehensive and fully open-source pipeline for training a high-performance agentic model for interacting with external tools and environments, named Klear-Qwen3-AgentForge, starting from the Qwen3-8B base model. We design effective supervised fine-tuning (SFT) with synthetic data followed by multi-turn reinforcement learning (RL) to unlock the potential for multiple diverse agentic tasks. We perform extensive experiments on various agentic benchmarks in both tool use and coding domains. Klear-Qwen3-AgentForge-8B achieves state-of-the-art performance among LLMs of similar size and remains competitive with significantly larger models.
Problem

Research questions and friction points this paper is trying to address.

Develops an open-source pipeline for training agentic models that invoke external tools
Addresses the lack of post-training details that hinders open-source agentic model development
Enables high-performance agent interaction with tools and environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses supervised fine-tuning (SFT) on synthetic data
Applies multi-turn reinforcement learning (RL) to diverse agentic tasks
Trains from the Qwen3-8B base model for tool interaction
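The two-stage recipe above (SFT on synthetic trajectories, then multi-turn RL against environment feedback) can be sketched with a toy example. Everything below is illustrative: the environment, policy, and update rules are stand-ins invented for this sketch, not the paper's actual training stack or any real library API.

```python
import random

class ToyToolEnv:
    """Toy multi-turn environment: an answer only counts if the agent searched first."""
    actions = ("search", "answer")

    def reset(self):
        self.history = []
        return self.obs()

    def obs(self):
        # Observation = the tool-call history so far (stands in for the dialogue state).
        return tuple(self.history)

    def step(self, action):
        self.history.append(action)
        done = action == "answer" or len(self.history) >= 4
        return self.obs(), done

    def reward(self):
        # Terminal reward, as in outcome-based multi-turn RL.
        return 1.0 if self.history == ["search", "answer"] else 0.0

class ToyPolicy:
    """Tabular stand-in for an LLM policy: scores (observation, action) pairs."""
    def __init__(self):
        self.weights = {}

    def act(self, obs, actions):
        # Greedy choice with random tie-breaking.
        return max(actions, key=lambda a: (self.weights.get((obs, a), 0.0), random.random()))

    def sft_update(self, obs, action, lr=1.0):
        # Stage 1 (SFT): push the policy toward synthetic demonstrations.
        self.weights[(obs, action)] = self.weights.get((obs, action), 0.0) + lr

    def rl_update(self, trajectory, reward, lr=0.5):
        # Stage 2 (multi-turn RL): reinforce every step of a rewarded rollout.
        for obs, action in trajectory:
            self.weights[(obs, action)] = self.weights.get((obs, action), 0.0) + lr * reward

def rollout(policy, env, max_turns=4):
    obs, trajectory = env.reset(), []
    for _ in range(max_turns):
        action = policy.act(obs, env.actions)
        trajectory.append((obs, action))
        obs, done = env.step(action)
        if done:
            break
    return trajectory, env.reward()

policy, env = ToyPolicy(), ToyToolEnv()

# Stage 1: SFT on one synthetic demonstration (search, then answer).
for obs, action in [((), "search"), (("search",), "answer")]:
    policy.sft_update(obs, action)

# Stage 2: multi-turn RL rollouts with terminal reward.
for _ in range(10):
    traj, r = rollout(policy, env)
    policy.rl_update(traj, r)

trajectory, score = rollout(policy, env)
```

The point of the sketch is the division of labor: SFT gives the policy a sensible prior over tool-calling behavior, and multi-turn RL then refines it using only trajectory-level reward from the environment.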