SWE-Universe: Scale Real-World Verifiable Environments to Millions

📅 2026-02-02
🤖 AI Summary
This work addresses three obstacles to automatically constructing realistic, verifiable software engineering (SWE) environments: low yield, weak verifiers, and high cost. It proposes a scalable framework that builds high-quality, multilingual SWE environments from GitHub pull requests, driven by a building agent that performs iterative self-verification and in-loop hacking detection. With this approach the authors construct 807,693 real-world verifiable SWE environments and use them for large-scale agentic mid-training and reinforcement learning. Applied to Qwen3-Max-Thinking, the technique reaches a score of 75.3% on SWE-Bench Verified.

📝 Abstract
We propose SWE-Universe, a scalable and efficient framework for automatically constructing real-world, verifiable software engineering (SWE) environments from GitHub pull requests (PRs). To overcome the prevalent challenges of automatic building, such as low production yield, weak verifiers, and prohibitive cost, our framework utilizes a building agent powered by an efficient custom-trained model. This agent employs iterative self-verification and in-loop hacking detection to ensure the reliable generation of high-fidelity, verifiable tasks. Using this method, we scale the number of real-world multilingual SWE environments to the million scale (807,693). We demonstrate the value of our environments through large-scale agentic mid-training and reinforcement learning. Finally, we apply this technique to Qwen3-Max-Thinking and achieve a score of 75.3% on SWE-Bench Verified. Our work provides both a critical resource and a robust methodology to advance the next generation of coding agents.
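The abstract names iterative self-verification and in-loop hacking detection but does not spell out how they work. The Python sketch below is one plausible reading, not the authors' implementation: a candidate verifier is accepted only if it fails on the pre-PR repository state and passes on the post-PR state, and it is rejected early if it appears to inspect source text rather than execute code. Every name here (`looks_hacked`, `build_environment`, the `propose` and `run` callbacks, the `"buggy"`/`"fixed"` labels, and the list of suspicious tokens) is a hypothetical chosen for illustration.

```python
from typing import Callable, Optional

# Hypothetical heuristic: tokens suggesting the verifier matches on source text
# instead of running the project's code. A real detector would be more careful.
SUSPICIOUS_TOKENS = ("grep ", "diff ", "cat ")


def looks_hacked(verifier: str) -> bool:
    """Return True if the verifier script contains any suspicious token."""
    return any(token in verifier for token in SUSPICIOUS_TOKENS)


def build_environment(
    propose: Callable[[str], str],    # assumed agent call: feedback -> verifier script
    run: Callable[[str, str], bool],  # assumed sandbox call: (verifier, state) -> passed?
    max_rounds: int = 3,
) -> Optional[str]:
    """Return an accepted verifier script, or None if every round fails."""
    feedback = ""
    for _ in range(max_rounds):
        verifier = propose(feedback)

        # In-loop hacking detection: reject before spending any sandbox time.
        if looks_hacked(verifier):
            feedback = "verifier inspects source text instead of executing code"
            continue

        # Self-verification: must fail before the PR and pass after it.
        passes_buggy = run(verifier, "buggy")
        passes_fixed = run(verifier, "fixed")
        if not passes_buggy and passes_fixed:
            return verifier

        # Feed the observed outcome back to the agent for the next attempt.
        feedback = f"buggy passed={passes_buggy}, fixed passed={passes_fixed}"
    return None
```

Under these assumptions, a verifier that passes on both states (for example, a script that always exits successfully) is rejected just like one that fails on both, which is the property that separates a usable verifier from a weak one.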
Problem

Research questions and friction points this paper is trying to address.

software engineering
verifiable environments
scalability
GitHub pull requests
automatic building
Innovation

Methods, ideas, or system contributions that make the work stand out.

SWE-Universe
verifiable environments
building agent
iterative self-verification
in-loop hacking detection