AI Summary
This work addresses the limited capability of open-source large language models to autonomously resolve long-horizon, real-world software engineering tasks by proposing a systematic post-training framework. The framework integrates, for the first time, teacher trajectory synthesis, long-horizon supervised fine-tuning, execution-feedback-based reinforcement learning, and test-time scaling (TTS) into an end-to-end, reproducible agent optimization pipeline. Evaluated on the SWE-bench Verified benchmark, the approach achieves a 61.4% resolution rate using Qwen2.5-Coder-32B, which further improves to 70.8% with TTS@8. These results substantially outperform existing open-source methods, demonstrating the effectiveness and reproducibility of the proposed methodology in tackling complex software engineering challenges.
Abstract
In this technical report, we present SWE-Master, an open-source and fully reproducible post-training framework for building effective software engineering agents. SWE-Master systematically explores the complete agent development pipeline, including teacher-trajectory synthesis and data curation, long-horizon SFT, RL with real execution feedback, and inference framework design. Starting from an open-source base model with limited initial SWE capability, SWE-Master demonstrates how systematic optimization can elicit strong long-horizon SWE task-solving abilities. We evaluate SWE-Master on SWE-bench Verified, a standard benchmark for realistic software engineering tasks. Under identical experimental settings, our approach achieves a resolve rate of 61.4% with Qwen2.5-Coder-32B, substantially outperforming existing open-source baselines. By further incorporating test-time scaling (TTS) with LLM-based environment feedback, SWE-Master reaches 70.8% at TTS@8, demonstrating strong performance potential. SWE-Master provides a practical and transparent foundation for advancing reproducible research on software engineering agents. The code is available at https://github.com/RUCAIBox/SWE-Master.
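The abstract does not spell out how TTS@8 chooses a final answer. One common reading of test-time scaling is best-of-k selection: sample k candidate patches for the same issue, score each with some feedback signal, and submit the highest-scoring one. The sketch below illustrates that idea only; the function name `select_best_of_k`, the `score` callable, and the toy scorer are hypothetical and are not taken from the SWE-Master codebase.

```python
from typing import Callable, Sequence


def select_best_of_k(
    candidates: Sequence[str],
    score: Callable[[str], float],
) -> str:
    """Return the highest-scoring candidate (the first one wins ties).

    `candidates` stands in for k independently sampled patches and
    `score` for any feedback signal, e.g. an LLM-based judge.
    Both are assumptions made for illustration.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Toy usage: a stand-in scorer that simply prefers shorter patches.
patches = ["patch-a-long", "patch-b", "patch-c-mid"]
best = select_best_of_k(patches, score=lambda p: -len(p))
```

With k = 8 this would correspond to a TTS@8-style setup under the stated assumption; in practice the scorer is the component that determines how much of the gap between single-attempt and multi-attempt performance is actually recovered.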