Beyond Fixed Tests: Repository-Level Issue Resolution as Coevolution of Code and Behavioral Constraints

📅 2026-04-06
🤖 AI Summary
This work addresses a key limitation of conventional large language model (LLM)-based program repair approaches: they treat tests as static constraints, which often yields under-constrained, brittle, or overfitted patches. To overcome this, the authors propose Agent-CoEvo, a multi-agent framework that models repository-level repair as a coevolutionary process between code patches and test patches. Through iterative mutual evaluation and semantic recombination, Agent-CoEvo refines behavioral constraints during the search, so that implementation and specification evolve jointly. Evaluated on SWE-bench Lite and SWT-bench Lite, the framework consistently outperforms state-of-the-art agent-based and agentless baselines in both repair success and test reproduction quality.
📝 Abstract
Software engineers resolving repository-level issues do not treat existing tests as immutable correctness oracles. Instead, they iteratively refine both code and the tests used to characterize intended behavior, as new modifications expose missing assumptions or misinterpreted failure conditions. In contrast, most existing large language model (LLM)-based repair systems adopt a linear pipeline in which tests or other validation signals act mostly as post-hoc filters, treating behavioral constraints as fixed during repair. This formulation reduces repair to optimizing code under static and potentially misaligned constraints, leading to under-constrained search and brittle or overfitted fixes. We argue that repository-level issue resolution is fundamentally not optimization under fixed tests, but search over evolving behavioral constraints. To operationalize this view, we propose Agent-CoEvo, a coevolutionary multi-agent framework in which candidate code patches and test patches are jointly explored and iteratively refined. Rather than treating tests as immutable oracles, our framework models them as dynamic constraints that both guide and are revised by the repair process. Through mutual evaluation and semantic recombination, code and test candidates progressively narrow the space of behavior consistent with the issue description. Evaluated on SWE-bench Lite and SWT-bench Lite, Agent-CoEvo consistently outperforms state-of-the-art agent-based and agentless baselines in both repair success and test reproduction quality. Our findings suggest that enabling repair agents to revise behavioral constraints during search is critical for reliable issue resolution, pointing toward a shift from code-only optimization to coevolution of implementation and specification.
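The mutual-evaluation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scoring rule, the names `mutual_scores` and `select`, and the use of plain functions and (input, expected) pairs as stand-ins for code patches and test patches are all assumptions made for illustration. The LLM-driven semantic recombination step is not modeled here.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: a "code patch" is a candidate function,
# a "test patch" is an (input, expected output) pair.
CodePatch = Callable[[int], int]
TestPatch = Tuple[int, int]


def mutual_scores(
    codes: List[CodePatch], tests: List[TestPatch], buggy: CodePatch
) -> Tuple[List[int], List[int]]:
    """Score code and test candidates against each other (assumed rule)."""
    # passes[i][j] is True when code candidate i satisfies test candidate j.
    passes = [[code(x) == want for (x, want) in tests] for code in codes]

    test_scores = []
    for j, (x, want) in enumerate(tests):
        # A test only counts if the original buggy code fails it,
        # i.e. it actually reproduces the reported issue.
        reproduces = buggy(x) != want
        supporters = sum(row[j] for row in passes)
        test_scores.append(supporters if reproduces else 0)

    # A code candidate earns credit only for tests that are themselves credible.
    code_scores = [
        sum(1 for j, ok in enumerate(row) if ok and test_scores[j] > 0)
        for row in passes
    ]
    return code_scores, test_scores


def select(items: list, scores: List[int], k: int) -> list:
    """Keep the k highest-scoring candidates (ties keep original order)."""
    ranked = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [items[i] for i in ranked[:k]]


# Toy issue: the function should return the absolute value, but returns x.
buggy = lambda x: x
codes = [lambda x: abs(x), lambda x: 3, lambda x: x]  # real fix, overfit, no-op
tests = [(-3, 3), (-1, 1), (-2, -2), (2, 4)]  # last two misread the issue

code_scores, test_scores = mutual_scores(codes, tests, buggy)
best_code = select(codes, code_scores, 1)[0]
best_tests = select(tests, test_scores, 2)
```

In this toy setting the two tests that misread the issue receive no credit, and the overfitted candidate that hardcodes one answer ranks below the general fix, which mirrors how evolving tests can narrow the space of acceptable behavior.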
Problem

Research questions and friction points this paper is trying to address.

repository-level issue resolution
behavioral constraints
test evolution
code repair
coevolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

coevolution
multi-agent framework
test evolution
behavioral constraints
repository-level repair
Kefan Li
Beihang University, China and Beijing Tokfinity Technology Co., Ltd., China
Yuan Yuan
Professor, Northwestern Polytechnical University, China
Computer vision, Image processing, Machine learning, Multimedia, Pattern recognition
Mengfei Wang
Beijing Tokfinity Technology Co., Ltd., China
Shihao Zheng
Beijing Tokfinity Technology Co., Ltd., China
Wei Wang
Beijing Tokfinity Technology Co., Ltd., China
Ping Yang
Beijing Tokfinity Technology Co., Ltd., China
Mu Li
Beihang University, China
Weifeng Lv
Beihang University, China