AI Summary
Current large language model (LLM) agents struggle to use execution evidence effectively for error localization, patch generation, and validation in program repair. This work proposes EviACT, a framework that integrates execution evidence throughout the repair pipeline. EviACT establishes a closed-loop "evidence-to-action" pathway through three evidence-driven constraints: retrieval scaffolding, compilation gating, and test-driven gating. Evaluated on four benchmarks, the approach outperforms the strongest reported comparable baselines by 1.6-6.0 percentage points in resolve rate and shows 70.1%-88.6% lower reported per-bug API cost where baseline costs are available, improving both the effectiveness and efficiency of automated program repair.
Abstract
LLM-based agents have moved automated program repair (APR) from fixed-context patch generation to interactive repository-level repair. However, existing agentic APR systems still struggle to use execution evidence to guide localization, patch generation, and validation. We propose EviACT (Evidence-to-Action), an agentic APR framework that coordinates three evidence-driven guardrails across repair stages. The retrieval scaffold grounds repair context, the compile gate filters invalid edits, and the test-driven gate checks target-test recovery before full regression. Across four benchmarks, EviACT improves resolve rate over the strongest reported comparable baselines by 1.6-6.0 percentage points and shows 70.1-88.6% lower reported per-bug API cost where baseline costs are available. Ablations and diagnostics suggest that these gains are associated with the coordinated evidence-to-action chain, making agentic APR more effective and efficient.
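The ordering of the three guardrails can be illustrated with a minimal Python sketch. It is not EviACT's implementation: the names `Patch`, `RepairResult`, and `repair_with_gates`, and all of the callables passed in, are hypothetical stand-ins chosen for illustration. The sketch only assumes what the abstract states, namely that retrieved context grounds patch proposals, that a compile check filters invalid edits, and that the target test is checked before the full regression suite.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Patch:
    """Hypothetical candidate edit, represented here only by its diff text."""
    diff: str


@dataclass
class RepairResult:
    """Accepted patch (if any) plus counts of where candidates were rejected."""
    patch: Optional[Patch] = None
    rejected_at_compile: int = 0
    rejected_at_target_test: int = 0
    rejected_at_regression: int = 0


def repair_with_gates(
    bug: str,
    retrieve_context: Callable[[str], str],
    propose_patches: Callable[[str], Iterable[Patch]],
    compiles: Callable[[Patch], bool],
    target_test_passes: Callable[[Patch], bool],
    regression_passes: Callable[[Patch], bool],
) -> RepairResult:
    """Run candidates through the gates in order of increasing cost (assumed)."""
    result = RepairResult()
    # Retrieval scaffold: ground patch proposals in retrieved repair context.
    context = retrieve_context(bug)
    for patch in propose_patches(context):
        # Compile gate: discard edits that do not build.
        if not compiles(patch):
            result.rejected_at_compile += 1
            continue
        # Test-driven gate: the target (failing) test must recover first.
        if not target_test_passes(patch):
            result.rejected_at_target_test += 1
            continue
        # Full regression runs only for candidates that passed both gates.
        if not regression_passes(patch):
            result.rejected_at_regression += 1
            continue
        result.patch = patch
        break
    return result
```

In this sketch the full regression suite, assumed here to be the most expensive check, is reached only by candidates that already compile and recover the target test. This is one plausible reading of how gating could reduce wasted validation work, and it says nothing about how the reported API cost reductions arise in the actual system.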