🤖 AI Summary
To address the efficiency bottleneck in Automated Program Repair (APR) caused by the explosive growth of vulnerabilities and the difficulty of root-cause analysis, this paper proposes a lightweight, crash-point–driven repair paradigm that bypasses traditional root-cause localization and directly synthesizes safe patches at observable error sites. Our approach introduces four key contributions: (1) the first crash-point–driven APR framework; (2) a template-guided, low-overhead LLM-based patch generation mechanism that significantly reduces token consumption; (3) a lightweight inference architecture deployable on consumer-grade hardware (e.g., Apple Mac M4 Mini); and (4) synergistic optimization with the CodeRover-S agent. Evaluated on the ARVO benchmark, our method achieves a 73.5% repair rate—representing a 29.6-percentage-point improvement over prior work—while reducing token cost by 45.9% and enabling 7,400 vulnerability repairs per dollar of compute.
📝 Abstract
The rapid advancement of bug-finding techniques has led to the discovery of more vulnerabilities than developers can reasonably fix, creating an urgent need for effective Automated Program Repair (APR) methods. However, the complexity of modern bugs often makes precise root-cause analysis difficult and unreliable. To address this challenge, we propose crash-site repair, which simplifies the repair task while still mitigating the risk of exploitation. In addition, we introduce a template-guided patch generation approach that significantly reduces the token cost of Large Language Models (LLMs) while maintaining both efficiency and effectiveness. We implement our prototype system, WILLIAMT, and evaluate it against state-of-the-art APR tools. Our results show that, when combined with the top-performing agent CodeRover-S, WILLIAMT reduces token cost by 45.9% and increases the bug-fixing rate to 73.5% (a 29.6-percentage-point improvement) on ARVO, a ground-truth benchmark of open-source software vulnerabilities. Furthermore, we demonstrate that WILLIAMT can function effectively even without access to frontier LLMs: even a local model running on a Mac M4 Mini achieves a reasonable repair rate. These findings highlight the broad applicability and scalability of WILLIAMT.