🤖 AI Summary
Existing code localization methods ignore the long-term memory that developers build of a codebase (such as module functionality and defect–fix location associations), so each task is reasoned about from scratch. Method: We propose a non-parametric, repository-level long-term memory mechanism grounded in commit history. By analyzing historical commits, issue–commit links, and summaries of actively evolving modules, it captures code evolution patterns and defect distribution regularities in a retrievable external memory store. This memory is integrated into the LocAgent framework for the first time, enabling memory-augmented localization reasoning. Contribution/Results: Adding this memory significantly improves LocAgent, a state-of-the-art localization framework, on both the SWE-bench-verified and SWE-bench-live benchmarks, providing empirical evidence that long-term memory matters for complex software engineering tasks.
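To make the idea of "actively evolving modules" concrete, here is a minimal sketch that ranks top-level directories by how many commits touched them. It is an illustration only, not the paper's method: the `Commit` record, the `active_modules` helper, and the use of a raw commit count as the activity signal are all assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Commit:
    """Hypothetical record for one historical commit (not from the paper)."""
    sha: str
    message: str
    files: tuple  # non-empty repository-relative paths touched by the commit


def active_modules(commits, top_k=3):
    """Rank top-level path components by the number of commits touching them."""
    counts = Counter()
    for commit in commits:
        # Count each module at most once per commit, however many files changed.
        modules = {PurePosixPath(path).parts[0] for path in commit.files}
        counts.update(modules)
    return counts.most_common(top_k)


# Toy history, invented for illustration.
history = [
    Commit("c1", "Fix parser crash", ("core/a.py", "core/b.py", "docs/x.md")),
    Commit("c2", "Refactor parser", ("core/a.py",)),
    Commit("c3", "Add helper", ("core/c.py", "utils/h.py")),
    Commit("c4", "Fix helper", ("utils/h.py",)),
]
ranking = active_modules(history, top_k=2)
```

In a real repository the `files` lists could be populated from the version-control log, and the highest-ranked modules would be the candidates for functionality summaries.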
📝 Abstract
Code localization is a fundamental challenge in repository-level software engineering tasks such as bug fixing. While existing methods equip language agents with comprehensive tools and interfaces to fetch information from the repository, they overlook the critical aspect of memory: each instance is typically handled from scratch, assuming no prior repository knowledge. In contrast, human developers naturally build long-term repository memory, such as the functionality of key modules and associations between various bug types and their likely fix locations. In this work, we augment language agents with such memory by leveraging a repository's commit history, a rich yet underutilized resource that chronicles the codebase's evolution. We introduce tools that allow the agent to retrieve from a non-parametric memory encompassing recent historical commits and linked issues, as well as functionality summaries of actively evolving parts of the codebase identified via commit patterns. We demonstrate that augmenting the agent with such a memory can significantly improve LocAgent, a state-of-the-art localization framework, on both SWE-bench-verified and the more recent SWE-bench-live benchmarks. Our research contributes towards developing agents that can accumulate and leverage past experience for long-horizon tasks, more closely emulating the expertise of human developers.
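The retrieval step described above can be sketched as a lookup over past commits and their linked issues. The sketch below is a rough illustration under stated assumptions, not the authors' tooling: `MemoryEntry`, `tokenize`, and `retrieve` are hypothetical names, and Jaccard token overlap stands in for whatever retriever the paper actually uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    """Hypothetical memory record: one past commit and its linked issue."""
    sha: str
    issue_text: str
    changed_files: tuple


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(memory, new_issue, top_k=2):
    """Return up to top_k entries whose issue text best overlaps new_issue."""
    query = tokenize(new_issue)
    scored = []
    for entry in memory:
        tokens = tokenize(entry.issue_text)
        union = query | tokens
        # Jaccard similarity; an assumed stand-in for a real retriever.
        score = len(query & tokens) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, entry))
    # Stable sort: entries with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


# Toy memory, invented for illustration.
memory = [
    MemoryEntry("a1", "Crash when parsing empty config file", ("config/parser.py",)),
    MemoryEntry("b2", "Tooltip color wrong in dark theme", ("ui/theme.py",)),
    MemoryEntry("c3", "Config parser ignores comments",
                ("config/parser.py", "config/lexer.py")),
]
hits = retrieve(memory, "Parser crash on empty config")
candidate_files = sorted({path for hit in hits for path in hit.changed_files})
```

The files changed by the retrieved commits then serve as candidate locations for the new issue, which is the kind of bug-type-to-fix-location association the abstract attributes to human developers.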