RGFL: Reasoning Guided Fault Localization for Automated Program Repair Using Large Language Models

📅 2026-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of precise fault localization in large-scale codebases for large language model (LLM)-driven program repair, where existing approaches are constrained by limited context windows. The authors propose a hierarchical reasoning-guided fault localization method that first generates structured error explanations for candidate files and code elements, then combines LLM-based reasoning with embedding signals in a two-stage ranking process to achieve high-precision localization. A novel counterfactual upper-bound analysis quantifies the contribution of each localization stage to repair success. Evaluated on SWE-bench Verified, the approach achieves 85% file-level Hit@1 (+13.6 percentage points), an MRR of 88.8%, a 69% element-level top-3 exact-match rate (+33 points), and a 12.8% improvement in end-to-end repair success rate.
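The two-stage ranking described above can be sketched as a cheap embedding-based shortlist followed by an LLM-guided re-ranking. The function names, fixed score tables, and the linear score fusion below are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of a two-stage ranking that fuses an LLM-derived
# relevance score with an embedding-similarity score. The weighting scheme
# and all names are assumptions for illustration only.

def two_stage_rank(candidates, llm_score, embed_score, k=10, alpha=0.7):
    """Stage 1: shortlist by embedding similarity; Stage 2: re-rank the
    shortlist with a weighted blend of LLM and embedding scores."""
    # Stage 1: cheap embedding-based filtering down to a top-k shortlist.
    shortlist = sorted(candidates, key=embed_score, reverse=True)[:k]
    # Stage 2: apply the (expensive) LLM signal only to the shortlist.
    fused = lambda c: alpha * llm_score(c) + (1 - alpha) * embed_score(c)
    return sorted(shortlist, key=fused, reverse=True)

# Toy usage with fixed score tables standing in for real model calls.
files = ["a.py", "b.py", "c.py", "d.py"]
emb = {"a.py": 0.9, "b.py": 0.8, "c.py": 0.4, "d.py": 0.1}
llm = {"a.py": 0.2, "b.py": 0.95, "c.py": 0.5, "d.py": 0.0}
ranking = two_stage_rank(files, llm.get, emb.get, k=3)
print(ranking[0])  # b.py: promoted by the LLM signal despite a lower embedding score
```

The design point is that the embedding stage bounds how many candidates the LLM must reason over, which is what makes the approach viable for repositories far exceeding the model's context window.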

📝 Abstract
Fault Localization (FL) is a critical step in Automated Program Repair (APR), and its importance has increased with the rise of Large Language Model (LLM)-based repair agents. In realistic project-level repair scenarios, software repositories often span millions of tokens, far exceeding current LLM context limits. Consequently, models must first identify a small, relevant subset of code, making accurate FL essential for effective repair. We present a novel project-level FL approach that improves both file- and element-level localization. Our method introduces a hierarchical reasoning module that (i) generates structured, bug-specific explanations for candidate files and elements, and (ii) leverages these explanations in a two-stage ranking scheme combining LLM-based and embedding-based signals. We further propose a counterfactual upper-bound analysis to quantify the contribution of each localization stage to repair success. We evaluate our approach on Python and Java projects from SWE-bench Verified, Lite, and Java. Compared to state-of-the-art baselines, including Agentless and OpenHands, our method consistently improves localization accuracy. On SWE-bench Verified, file-level Hit@1 improves from 71.4% to 85%, and MRR from 81.8% to 88.8%. At the element level, Exact Match under top-3 files increases from 36% to 69%. Integrating our localization into Agentless yields a 12.8% end-to-end repair success improvement.
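For reference, the file-level metrics quoted in the abstract, Hit@1 and MRR, can be computed as below; the example rankings and gold files are made up for illustration:

```python
# Minimal sketch of the reported localization metrics:
# Hit@1 (is the gold file ranked first?) and MRR (mean reciprocal rank).

def hit_at_1(rankings, gold):
    """Fraction of bugs whose gold file is ranked first."""
    return sum(r[0] == g for r, g in zip(rankings, gold)) / len(gold)

def mrr(rankings, gold):
    """Mean reciprocal rank of the gold file (contributes 0 if absent)."""
    total = 0.0
    for r, g in zip(rankings, gold):
        if g in r:
            total += 1.0 / (r.index(g) + 1)
    return total / len(gold)

# Three bugs: gold file ranked 1st, 2nd, and 1st respectively.
rankings = [["a.py", "b.py"], ["x.py", "y.py"], ["m.py", "n.py"]]
gold = ["a.py", "y.py", "m.py"]
print(hit_at_1(rankings, gold))  # 2/3
print(mrr(rankings, gold))       # (1 + 1/2 + 1) / 3
```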
Problem

Research questions and friction points this paper addresses.

Fault Localization
Automated Program Repair
Large Language Models
Project-level Repair
Code Localization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reasoning Guided Fault Localization
Large Language Models
Automated Program Repair
Hierarchical Reasoning
Counterfactual Analysis