A Multi-Agent Approach to Fault Localization via Graph-Based Retrieval and Reflexion

📅 2024-09-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address low fault localization accuracy, poor long-context handling, and reliance on supervised training in complex software systems, this paper proposes LLM4FL, a multi-agent framework that requires no fine-tuning. Three specialized agents collaborate: one extracts failure context, one debugs with graph-augmented code navigation, and one reviews the results through language-based reflective self-critique. By combining order-sensitive chunking of coverage data, graph-structured code navigation, retrieval-augmented reasoning, and prompt chaining, LLM4FL works around LLM token limits and scales to larger systems. Evaluated on Defects4J v2.0.0, it improves Top-1 localization accuracy by 18.55% over AutoFL and outperforms supervised methods such as DeepFL and Grace. Ablation studies show that the coverage segmentation and prompt chaining strategies increase Top-1 accuracy by up to 22%.

📝 Abstract

Identifying and resolving software faults remains a challenging and resource-intensive process. Traditional fault localization techniques, such as Spectrum-Based Fault Localization (SBFL), leverage statistical analysis of test coverage but often suffer from limited accuracy. While learning-based approaches improve fault localization, they demand extensive training datasets and high computational resources. Recent advances in Large Language Models (LLMs) offer new opportunities by enhancing code understanding and reasoning. However, existing LLM-based fault localization techniques face significant challenges, including token limitations, performance degradation with long inputs, and scalability issues in complex software systems. To overcome these obstacles, we propose LLM4FL, a multi-agent fault localization framework that utilizes three specialized LLM agents. First, the Context Extraction Agent applies an order-sensitive segmentation strategy to partition large coverage data into segments that fit within the LLM's token limit, analyzes the failure context, and prioritizes failure-related methods. Next, the Debugger Agent processes the extracted data, using graph-based retrieval-augmented code navigation to reason about failure causes and rank suspicious methods. Finally, the Reviewer Agent re-evaluates the identified faulty methods using verbal reinforcement learning, engaging in self-criticism and iterative refinement. Evaluated on the Defects4J (V2.0.0) benchmark, which includes 675 faults from 14 Java projects, LLM4FL achieves an 18.55% improvement in Top-1 accuracy over AutoFL and 4.82% over SoapFL. It also outperforms supervised techniques such as DeepFL and Grace, without requiring any task-specific training. Furthermore, its coverage segmentation and prompt chaining strategies enhance performance, increasing Top-1 accuracy by up to 22%.
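The order-sensitive segmentation step might be pictured roughly as follows. This is a minimal sketch, not the paper's implementation: the greedy packing rule, the whitespace-based token estimate, the function names `approx_tokens` and `segment_in_order`, and the sample coverage entries are all assumptions made for illustration. The only property taken from the abstract is that coverage data is split to fit a token limit while its order is preserved.

```python
from typing import Callable, List


def approx_tokens(text: str) -> int:
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def segment_in_order(
    entries: List[str],
    token_limit: int,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> List[List[str]]:
    """Greedily pack entries into consecutive chunks without reordering them."""
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for entry in entries:
        cost = count_tokens(entry)
        # Close the current chunk when the next entry would exceed the budget.
        if current and used + cost > token_limit:
            chunks.append(current)
            current, used = [], 0
        # An entry larger than the limit still gets a chunk of its own.
        current.append(entry)
        used += cost
    if current:
        chunks.append(current)
    return chunks


# Hypothetical coverage entries, listed in the order they were recorded.
covered = [
    "Parser.parse(String) lines 10-42",
    "Lexer.next() lines 5-19",
    "Parser.expect(Token) lines 50-61",
    "ErrorReporter.fail(String) lines 7-12",
]
chunks = segment_in_order(covered, token_limit=7)
```

Each chunk could then be sent to the model as a separate prompt. Because the chunks are consecutive slices of the input, concatenating them reproduces the original order.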

Problem

Research questions and friction points this paper is trying to address.

Improves fault localization accuracy in software systems.

Addresses limitations of existing LLM-based fault localization techniques.

Enhances scalability and performance in complex software environments.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework with specialized LLM agents

Graph-based retrieval-augmented code navigation

Verbal reinforcement learning for iterative refinement
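To make the graph-based navigation item more concrete, the sketch below retrieves the methods near a suspicious method in a call graph. It is only an illustration under stated assumptions: the summary does not say which graph LLM4FL builds or how it is traversed, so the call-graph representation, the breadth-first walk, the hop limit, and the name `neighbors_within` are all hypothetical.

```python
from collections import deque
from typing import Dict, List


def neighbors_within(
    call_graph: Dict[str, List[str]], start: str, max_hops: int
) -> List[str]:
    """Breadth-first walk: methods reachable from `start` in at most `max_hops` calls."""
    seen = {start}
    order: List[str] = []
    queue = deque([(start, 0)])
    while queue:
        method, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for callee in call_graph.get(method, []):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append((callee, hops + 1))
    return order


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "Parser.parse": ["Lexer.next", "Parser.expect"],
    "Parser.expect": ["ErrorReporter.fail"],
    "Lexer.next": [],
    "ErrorReporter.fail": [],
}
context = neighbors_within(call_graph, "Parser.parse", max_hops=1)
```

Limiting the walk to a few hops keeps the retrieved context small, so only code close to the method under inspection needs to be placed in the prompt.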

🔎 Similar Papers

Failure Diagnosis in Microservice Systems: A Comprehensive Survey and Analysis