🤖 AI Summary
Current LLM-based agents face two key bottlenecks in repository-scale software repair: (1) a lack of procedural knowledge—i.e., interpretable, principle-grounded steps for problem analysis, planning, and patch generation—and (2) reliance on computationally expensive, blind search. This paper introduces the first hierarchical procedural knowledge framework, constructed from historical repair data, which extracts multi-granularity repair steps and their underlying reasoning justifications. The authors further design a knowledge-driven online matching mechanism and a strategy transfer module to provide structured, end-to-end guidance throughout the problem-solving process. The approach significantly reduces search overhead, achieving a 74.6% resolution rate on SWE-bench Verified and outperforming five state-of-the-art methods. Ablation studies confirm that the performance gains stem primarily from effective modeling and reuse of procedural knowledge.
📝 Abstract
Driven by advances in Large Language Models (LLMs), LLM-powered agents are making significant progress on software engineering tasks, yet they still struggle with complex, repository-level issue resolution. Existing agent-based methods have two key limitations. First, they lack procedural knowledge (i.e., how an issue is fixed step by step and the rationale behind it) to learn from and leverage for issue resolution. Second, they rely on massive computational power to blindly explore the solution space.
To address these limitations, we propose Lingxi, an issue resolution framework that leverages procedural knowledge extracted from historical issue-fixing data to guide agents in solving repository-level issues. Lingxi first constructs this knowledge offline through a hierarchical abstraction mechanism, enabling agents to learn the how and why behind a fix, not just the final solution. During online application, it employs a knowledge-driven scaling method that leverages the procedural knowledge of similar issues to intelligently analyze the target issue from multiple perspectives, in sharp contrast to undirected, brute-force exploration.
Lingxi successfully resolves 74.6% of bugs on the SWE-bench Verified benchmark in the Pass@1 setting, outperforming five state-of-the-art techniques by a significant margin (5.4% to 14.9%). Our comprehensive ablation study confirms that the success of Lingxi comes directly from its use of procedural knowledge; without it, the performance gains from scaling alone are negligible. Our qualitative study further shows that "design patterns & coding practices" is the most critical knowledge aspect, and that the roles of different knowledge aspects shift across the stages of issue resolution (i.e., analysis, planning, and fixing).
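The core online step, retrieving procedural knowledge from similar historical issues to guide the target fix, can be sketched roughly as follows. This is an illustrative sketch, not Lingxi's actual implementation: the names (`ProceduralKnowledge`, `match_similar_issues`) are hypothetical, and word-overlap (Jaccard) similarity stands in for whatever retrieval the paper actually uses.

```python
# Hedged sketch of knowledge-driven issue matching: rank historical issues
# by textual similarity to the target issue and reuse their repair steps.
from dataclasses import dataclass, field

@dataclass
class ProceduralKnowledge:
    issue_summary: str                            # historical issue description
    steps: list = field(default_factory=list)     # abstracted repair steps

def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two issue descriptions (toy similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def match_similar_issues(target: str, knowledge_base, k: int = 2):
    """Return the k historical entries most similar to the target issue."""
    ranked = sorted(knowledge_base,
                    key=lambda e: jaccard(target, e.issue_summary),
                    reverse=True)
    return ranked[:k]

kb = [
    ProceduralKnowledge("null pointer crash when config file missing",
                        ["reproduce crash", "guard config load", "add default"]),
    ProceduralKnowledge("race condition in cache eviction thread",
                        ["add lock around eviction", "write stress test"]),
]
hits = match_similar_issues("crash on startup when config file is missing", kb)
print(hits[0].steps)  # → ['reproduce crash', 'guard config load', 'add default']
```

In the paper's framing, the retrieved steps and their rationales then seed the agent's analysis, planning, and fixing stages instead of an undirected search.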