🤖 AI Summary
Existing LLM-based fault localization methods rely solely on general-purpose capabilities and lack project-specific knowledge, limiting their effectiveness in complex software projects. To address this, we propose MemFL, a novel framework that equips LLMs with an external memory combining static project summaries (e.g., code structure, API usage) and dynamic, iteratively refined debugging memories (e.g., feedback from earlier repair attempts, error context). MemFL decomposes fault localization into three streamlined steps, enabling LLMs to localize faults in complex projects without fine-tuning. Using GPT-4o-mini with memory-augmented, context-enhanced prompting, MemFL localizes 12.7% more Defects4J bugs than current LLM-based methods while requiring only 21% of their execution time (17.4 seconds per bug) and 33% of their API cost ($0.0033 per bug); on complex projects its advantage grows to 27.6%. With GPT-4.1-mini, MemFL outperforms existing methods by 24.4%, at 24.7 seconds and $0.0094 per bug.
📄 Abstract
Fault localization, the process of identifying the software components responsible for failures, is essential but often time-consuming. Recent advances in Large Language Models (LLMs) have enabled fault localization without extensive defect datasets or model fine-tuning. However, existing LLM-based methods rely only on general LLM capabilities and lack integration of project-specific knowledge, resulting in limited effectiveness, especially for complex software. We introduce MemFL, a novel approach that enhances LLM-based fault localization by integrating project-specific knowledge via external memory. This memory includes static summaries of the project and dynamic, iterative debugging insights gathered from previous attempts. By leveraging external memory, MemFL simplifies debugging into three streamlined steps, significantly improving efficiency and accuracy. Iterative refinement through dynamic memory further enhances reasoning quality over time. Evaluated on the Defects4J benchmark, MemFL using GPT-4o-mini localized 12.7% more bugs than current LLM-based methods, achieving this improvement with just 21% of the execution time (17.4 seconds per bug) and 33% of the API cost (0.0033 dollars per bug). On complex projects, MemFL's advantage increased to 27.6%. Additionally, MemFL with GPT-4.1-mini outperformed existing methods by 24.4%, requiring only 24.7 seconds and 0.0094 dollars per bug. MemFL thus demonstrates significant improvements by effectively incorporating project-specific knowledge into LLM-based fault localization, delivering high accuracy with reduced time and cost.
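The dual-memory loop described above can be sketched in a few lines. This is a hypothetical illustration, not MemFL's actual implementation: the function names (`call_llm`, `localize`), the wording of the three steps, and the prompt layout are all assumptions; only the overall shape (static memory plus dynamic memory accumulated across iterations, feeding a three-step LLM pipeline) comes from the paper's description.

```python
# Hypothetical sketch of MemFL-style fault localization with external memory.
# All names and step wording are illustrative; the paper's pipeline differs in detail.

def call_llm(prompt: str) -> str:
    # Stand-in for an LLM API call (e.g., GPT-4o-mini).
    # Here it just echoes a prefix of the prompt so the sketch is runnable.
    return f"response to: {prompt[:40]}..."

def localize(failing_test: str, code: str, static_memory: str, rounds: int = 2) -> str:
    dynamic_memory: list[str] = []  # debugging insights accumulated across iterations
    suspects = ""
    for _ in range(rounds):
        # External memory = static project summary + dynamic insights so far.
        context = static_memory + "\n" + "\n".join(dynamic_memory)
        # Step 1: interpret the failure in light of project-specific knowledge.
        review = call_llm(f"Project context:\n{context}\nFailure:\n{failing_test}")
        # Step 2: rank suspicious code elements.
        suspects = call_llm(f"Given review:\n{review}\nRank suspicious methods in:\n{code}")
        # Step 3: verify the ranking and record an insight for the next iteration
        # (this is the dynamic-memory refinement the abstract describes).
        insight = call_llm(f"Check ranking:\n{suspects}\nRecord what was learned.")
        dynamic_memory.append(insight)
    return suspects
```

The key design point is that `dynamic_memory` grows between rounds, so each iteration reasons with feedback from previous attempts rather than starting from scratch.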