Echo: Graph-Enhanced Retrieval and Execution Feedback for Issue Reproduction Test Generation

📅 2026-03-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty of locating the root cause of software defects, which is often hindered by bug reports that lack a reproducing test case and by the high cost of writing such tests by hand. To overcome this, we propose Echo, an agent that combines automatic test execution with a patch-based validation feedback loop. Echo uses a code knowledge graph and automatic query refinement to strengthen context retrieval, executes each generated test automatically, and uses a generated patch to check whether a candidate test meets the fail-to-pass criterion. It produces a single test per issue rather than sampling and ranking multiple candidates. Evaluated on the SWT-Bench Verified benchmark, Echo achieves a 66.28% success rate, the state of the art among open-source methods, while offering a better cost-performance trade-off than multi-candidate approaches.
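
To make the idea of graph-enhanced retrieval concrete, here is a minimal sketch of how a code graph might widen the context around a query. It is not Echo's implementation: the toy graph, the node names, and the `retrieve_context` helper are all hypothetical, and simple substring matching stands in for whatever matching and query refinement the real system performs.

```python
from collections import deque

# Hypothetical toy code graph: each node maps to the nodes it references
# (for example calls or imports). All names are invented for illustration.
CODE_GRAPH = {
    "parser.parse_date": ["parser._normalize", "utils.to_utc"],
    "parser._normalize": ["utils.strip"],
    "utils.to_utc": [],
    "utils.strip": [],
    "cli.main": ["parser.parse_date"],
}


def retrieve_context(graph, query_terms, max_hops=1):
    """Return nodes whose name matches a query term, plus every node
    reachable from them within max_hops outgoing edges."""
    seeds = [
        node for node in graph
        if any(term.lower() in node.lower() for term in query_terms)
    ]
    seen = set(seeds)
    queue = deque((node, 0) for node in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop budget
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return sorted(seen)
```

Calling `retrieve_context(CODE_GRAPH, ["parse_date"])` returns the matched function together with its direct dependencies; raising `max_hops` pulls in more distant helpers at the cost of a larger context.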

📝 Abstract

Identifying the root cause of a bug remains difficult for many developers because bug reports often lack a bug-reproducing test case that reliably triggers the failure. Manually writing such test cases is time-consuming and requires substantial effort to understand the codebase and isolate the failing behavior. To address this challenge, we propose Echo, an agent for generating issue-reproducing test cases, which advances previous work in several ways. During generation, Echo strengthens context retrieval by leveraging a code graph and a novel automatic query-refinement strategy. Echo also improves upon previous tools by automatically executing generated test cases, a first-of-its-kind feature that integrates seamlessly into practical development workflows. In addition, Echo generates potential patches and uses the patched version to validate whether a candidate test meets the fail-to-pass criterion and to provide actionable feedback for refinement. Unlike prior bug-reproduction agents that sample and rank multiple candidate tests, Echo generates a single test per issue, offering a better cost-performance trade-off. Experiments on SWT-Bench Verified show that Echo establishes a new state of the art among open-source approaches, achieving a 66.28% success rate.
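
The fail-to-pass criterion can be illustrated with a small sketch: a candidate test is accepted only if it fails on the original code and passes on the patched version, and otherwise a message is returned that could guide refinement. This is an illustration under stated assumptions, not Echo's code. The helper names (`run_test`, `fail_to_pass`) and the toy `buggy_abs` / `patched_abs` functions are hypothetical, and plain Python callables stand in for the two versions of a repository.

```python
def run_test(test_fn, implementation):
    """Run test_fn against implementation; return (passed, message)."""
    try:
        test_fn(implementation)
        return True, ""
    except Exception as exc:  # any error counts as a failing test
        return False, f"{type(exc).__name__}: {exc}"


def fail_to_pass(test_fn, buggy_impl, patched_impl):
    """Return (accepted, feedback) for a candidate reproduction test."""
    passed_before, _ = run_test(test_fn, buggy_impl)
    passed_after, msg_after = run_test(test_fn, patched_impl)
    if passed_before:
        return False, "test passes on the unpatched code, so it does not reproduce the issue"
    if not passed_after:
        return False, f"test still fails on the patched code: {msg_after}"
    return True, "fail-to-pass satisfied"


# Toy stand-ins for the original and patched versions (hypothetical).
def buggy_abs(x):
    return x  # bug: negative inputs are returned unchanged


def patched_abs(x):
    return -x if x < 0 else x


def good_test(impl):
    assert impl(-3) == 3, "abs(-3) should be 3"


def weak_test(impl):
    assert impl(3) == 3, "abs(3) should be 3"
```

Here `fail_to_pass(good_test, buggy_abs, patched_abs)` accepts the test, while `weak_test` is rejected because it never exercises the faulty path. Since the patch is itself generated, a rejection in a real setting may reflect a wrong patch instead of a wrong test, so the feedback is best read as a hint.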

Problem

Research questions and friction points this paper is trying to address.

bug reproduction

test case generation

issue reproduction

software debugging

failure triggering

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-enhanced retrieval

execution feedback

test generation