🤖 AI Summary
File-level bug localization remains a critical bottleneck in software maintenance, and existing retrieval-augmented generation (RAG) approaches rely on static retrieval and lack the reasoning needed to identify faulty code accurately. This work proposes BLAgent, an agentic RAG framework for file-level bug localization that integrates path-augmented AST-based chunking, dual-perspective query transformation capturing both structural and behavioral signals, and two-phase agentic reranking that combines symbolic inspection with evidence-grounded reasoning. On SWE-bench Lite, BLAgent attains over 78% Top-1 accuracy with open-source models and over 86% with a closed-source model, is over 18x cheaper than the strongest baseline using the same model, and improves end-to-end repair success by over 20% when integrated into an automated program repair (APR) framework.
📝 Abstract
Bug localization remains a key bottleneck in downstream software maintenance tasks, including root cause analysis, triage, and automated program repair (APR), despite recent advances in large language model (LLM)-based repair systems. File-level bug localization is especially critical in hierarchical pipelines, where errors can propagate to downstream stages such as statement-level localization or patch generation. While Retrieval-Augmented Generation (RAG) offers a promising direction for grounding LLMs in repository context, existing RAG pipelines rely on static retrieval and lack the reasoning needed to identify faulty code accurately. In this work, we present BLAgent, a novel agentic RAG framework for file-level bug localization that integrates three key ideas: (i) code structure-aware repository encoding with path-augmented AST-based chunking, (ii) dual-perspective query transformation capturing both structural and behavioral signals, and (iii) two-phase agentic reranking combining symbolic inspection with evidence-grounded reasoning. Unlike prior graph-based or multi-hop agentic approaches, BLAgent performs bounded reasoning over a compact candidate set, balancing accuracy and cost. On SWE-bench Lite, BLAgent attains over 78% Top-1 accuracy with open-source models and over 86% with a closed-source model, while being over 18x cheaper than the strongest baseline using the same model. When integrated into an APR framework, it improves end-to-end repair success by over 20%.
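To make idea (i) concrete, the following is a minimal sketch of what path-augmented AST-based chunking could look like for Python source files. It is an illustration under stated assumptions, not BLAgent's actual implementation: the abstract does not specify chunk granularity or header format, so the function name `path_augmented_chunks`, the `# path:` header, and the choice of one chunk per function or class are all hypothetical.

```python
import ast


def path_augmented_chunks(source: str, file_path: str) -> list[dict]:
    """Split Python source into function/class chunks, each tagged with its path.

    Assumption: one chunk per function or class definition, prefixed with the
    file path and the dotted symbol name. The real chunking scheme may differ.
    """
    tree = ast.parse(source)
    chunks: list[dict] = []

    def visit(node: ast.AST, scope: list[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                qualified = ".".join(scope + [child.name])
                # Recover the exact source text covered by this AST node.
                body = ast.get_source_segment(source, child) or ""
                header = f"# path: {file_path} :: {qualified}"
                chunks.append(
                    {"path": file_path, "symbol": qualified, "text": header + "\n" + body}
                )
                # Descend into classes so methods also become their own chunks.
                if isinstance(child, ast.ClassDef):
                    visit(child, scope + [child.name])

    visit(tree, [])
    return chunks


if __name__ == "__main__":
    example = "class A:\n    def f(self):\n        return 1\n\ndef g():\n    return 2\n"
    for chunk in path_augmented_chunks(example, "pkg/mod.py"):
        print(chunk["symbol"])
```

Prefixing each chunk with its file path and symbol name means a retriever can still see where a chunk lives in the repository after the chunk is separated from its file, which is plausibly the motivation for "path-augmented" encoding.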