AI Summary
This work addresses the limited generalization of small language models in privacy-sensitive and resource-constrained settings, where they struggle to adapt to unfamiliar codebases due to weak out-of-distribution reasoning capabilities. To overcome this, we propose Repository-Centric Learning (RCL), a novel paradigm that shifts from traditional task-centric training to deeply internalizing the "physical laws" of a single code repository, thereby constructing lightweight, repository-specialized expert models. We introduce a four-component RCL training framework that transforms static code repositories into interactive learning signals, yielding the SWE-Spot-4B model series. These models outperform larger open-source counterparts such as Qwen3-Coder-30B across multiple software engineering benchmarks, match the performance of efficient commercial models like GPT-4.1-mini, and achieve significantly higher sample efficiency and lower inference costs.
Abstract
The deployment of coding agents in privacy-sensitive and resource-constrained environments drives the demand for capable open-weight Small Language Models (SLMs). However, SLMs suffer from a fundamental capability gap: unlike frontier large models, they lack the strong inference-time generalization needed to work with complicated, unfamiliar codebases. We identify that the prevailing Task-Centric Learning (TCL) paradigm, which scales exposure across disparate repositories, fails to address this limitation. In response, we propose Repository-Centric Learning (RCL), a paradigm shift that prioritizes vertical repository depth over horizontal task breadth: SLMs must internalize the "physics" of a target software environment through parametric knowledge acquisition, rather than attempting to recover it via costly inference-time search. Following this new paradigm, we design a four-unit Repository-Centric Experience that transforms static codebases into interactive learning signals, and use it to train SWE-Spot-4B, a family of highly compact models built as repo-specialized experts. SWE-Spot-4B breaks established scaling trends, outperforming larger open-weight models (e.g., CWM by Meta, Qwen3-Coder-30B) and matching or surpassing efficiency-focused commercial models (e.g., GPT-4.1-mini, GPT-5-nano) across multiple SWE tasks. Further analysis reveals that RCL yields higher training sample efficiency and lower inference costs, emphasizing that for building efficient intelligence, repository mastery is a distinct and necessary dimension that complements general coding capability.