🤖 AI Summary
This work addresses the limited adaptability and low efficiency of SWE-Agents on software engineering tasks. We propose using the Agentless training paradigm as a mechanism for constructing skill priors: a single round of supervised fine-tuning (SFT) on 5K publicly available code trajectories models structured capabilities—code localization, editing, and self-reflection—yielding transferable skill representations. To our knowledge, this is the first approach to bridge Agentless training and multi-turn agent frameworks, improving both the generalization and the reasoning efficiency of SWE-Agents. On the SWE-bench Verified benchmark, the base model Kimi-Dev achieves 60.4% accuracy, and the resulting SWE-Agent attains 48.6% pass@1—comparable to the closed-source Claude 3.5 Sonnet. Our method establishes a reusable, open-weight paradigm for skill transfer in code intelligence agents.
📝 Abstract
Large Language Models (LLMs) are increasingly applied to software engineering (SWE), with SWE-bench as a key benchmark. Solutions fall into two camps: SWE-Agent frameworks with multi-turn interactions, and workflow-based Agentless methods with single-turn verifiable steps. We argue these paradigms are not mutually exclusive: reasoning-intensive Agentless training induces skill priors—localization, code editing, and self-reflection—that enable efficient and effective SWE-Agent adaptation. In this work, we first curate the Agentless training recipe and present Kimi-Dev, an open-source SWE LLM achieving 60.4% on SWE-bench Verified, the best among workflow approaches. With additional SFT adaptation on 5k publicly available trajectories, Kimi-Dev powers SWE-Agents to 48.6% pass@1, on par with Claude 3.5 Sonnet (20241022 version). These results show that structured skill priors from Agentless training can bridge workflow and agentic frameworks, enabling transferable coding agents.