🤖 AI Summary
Existing LLM-based vulnerability repair methods suffer from two critical bottlenecks: *location blindness* (neglecting precise vulnerability localization) and *uncontrolled iteration* (lacking rigorous patch quality assessment). This paper proposes a location-aware and taint-coverage-guided iterative repair paradigm. First, it combines static vulnerability localization with Proof-of-Vulnerability (PoV) execution trace guidance to tightly constrain the repair scope. Second, it introduces a two-dimensional patch evaluation mechanism that checks whether a candidate patch introduces new vulnerabilities and quantifies its taint statement coverage, thereby improving candidate patch filtering. The framework unifies large language models, static program analysis, dynamic taint tracking, and test-driven feedback. Evaluated on the VulnLoc+ dataset, the approach generates 27 plausible patches, which is comparable to or 8 to 22 more than state-of-the-art methods, and correctly repairs 8 to 13 more real-world C/C++ vulnerabilities than existing approaches.
📝 Abstract
Advances in large language models (LLMs) have paved the way for automated software vulnerability repair approaches that iteratively refine a patch until it becomes plausible. Nevertheless, existing LLM-based vulnerability repair approaches face two notable limitations: 1) they ignore which locations need to be patched and focus solely on the repair content, and 2) they lack quality assessment for the candidate patches generated during the iterative process.
To tackle these two limitations, we propose sysname, an LLM-based approach that first provides information about which locations should be patched. Furthermore, sysname improves the iterative repair strategy by assessing the quality of test-failing patches and selecting the best patch for the next iteration. We introduce two dimensions to assess patch quality: whether a patch introduces new vulnerabilities, and its taint statement coverage. We evaluated sysname on VulnLoc+, a real-world C/C++ vulnerability repair dataset that contains 40 vulnerabilities and their Proofs-of-Vulnerability. The experimental results demonstrate that sysname substantially improves on the state-of-the-art Neural Machine Translation-based, Program Analysis-based, and LLM-based vulnerability repair approaches. Specifically, sysname generates 27 plausible patches, which is comparable to the baselines or exceeds them by 8 to 22 patches. In terms of correct patch generation, sysname repairs 8 to 13 more vulnerabilities than existing approaches.
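The selection step described above, choosing the best test-failing patch for the next iteration along the two quality dimensions, can be illustrated with a minimal Python sketch. The abstract does not define how taint statement coverage is computed or how the two dimensions are combined, so everything here is an assumption made for illustration: coverage is taken to be the fraction of tainted statements that a patch modifies, patches that introduce no new vulnerability are ranked first, and the names `CandidatePatch`, `taint_statement_coverage`, and `select_best_patch` are hypothetical rather than part of sysname.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePatch:
    """A test-failing candidate patch (hypothetical structure)."""
    patch_id: str
    introduces_new_vulnerability: bool
    patched_statements: frozenset  # line numbers the patch modifies


def taint_statement_coverage(patch, tainted_statements):
    """Assumed definition: share of tainted statements the patch modifies."""
    if not tainted_statements:
        return 0.0
    covered = patch.patched_statements & tainted_statements
    return len(covered) / len(tainted_statements)


def select_best_patch(candidates, tainted_statements):
    """Pick the candidate that seeds the next repair iteration.

    Assumed ordering: patches without new vulnerabilities rank first,
    and ties are broken by higher taint statement coverage.
    """
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda p: (
            not p.introduces_new_vulnerability,
            taint_statement_coverage(p, tainted_statements),
        ),
    )


# Illustrative data only: statement numbers and patches are made up.
tainted = frozenset({10, 11, 12, 13})
candidates = [
    CandidatePatch("a", True, frozenset({10, 11, 12, 13})),
    CandidatePatch("b", False, frozenset({10, 11})),
    CandidatePatch("c", False, frozenset({10, 11, 12})),
]
best = select_best_patch(candidates, tainted)
print(best.patch_id)
```

In this made-up example, patch `a` covers every tainted statement but is ranked last because it introduces a new vulnerability, while `c` is preferred over `b` for its higher coverage. A real system might weigh the two dimensions differently.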