How and Why Agents Can Identify Bug-Introducing Commits

📅 2026-03-31

🤖 AI Summary

Identifying the commits that introduce software defects has seen only modest accuracy gains over the past two decades. This work proposes a simple workflow in which large language model (LLM) agents derive short, greppable search patterns from a fix commit's diff and message, then use those patterns to search a set of candidate commits for the bug-introducing one. On the most popular Linux kernel dataset, the method reaches an F1-score of 0.81, up from the previous state-of-the-art score of 0.64. That gain of 0.17 is larger than the 0.10 improvement between the original 2005 method (0.54) and the previous state of the art.

📝 Abstract

Śliwerski, Zimmermann, and Zeller (SZZ) just won the 2026 ACM SIGSOFT Impact Award for asking: When do changes induce fixes? Their paper from 2005 served as the foundation for a wide array of approaches aimed at identifying bug-introducing changes (or commits) from fix commits in software repositories. But even after two decades of progress, the best-performing approach from 2025 yields a modest increase of 10 percentage points in F1-score on the most popular Linux kernel dataset. In this paper, we uncover how and why LLM-based agents can substantially advance the state-of-the-art in identifying bug-introducing commits from fix commits. We propose a simple agentic workflow based on searching a set of candidate commits and find that it raises the F1-score from 0.64 to 0.81 on the most popular Linux kernel dataset, a bigger jump than between the original 2005 method (0.54) and the previous SOTA (0.64). We also uncover why agents are so successful: They derive short greppable patterns from the fix commit diff and message and use them to effectively search and find bug-introducing commits in large candidate sets. Finally, we also discuss how these insights might enable further progress in bug detection, root cause understanding, and repair.
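The search step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Commit` class, the helper names, the sample diffs, and the hand-written pattern are all hypothetical. In the workflow the abstract describes, an LLM agent would derive the pattern from the fix commit's diff and message; here it is hard-coded so the example stays self-contained.

```python
import re
from dataclasses import dataclass


@dataclass
class Commit:
    """Hypothetical container for one candidate commit."""
    sha: str
    diff: str  # unified diff text


def added_lines(diff: str) -> list[str]:
    """Return the lines a commit added ('+' prefix, skipping '+++' file headers)."""
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


def search_candidates(pattern: str, candidates: list[Commit]) -> list[str]:
    """Return the SHAs of candidates whose added lines match the regex pattern."""
    regex = re.compile(pattern)
    return [
        c.sha
        for c in candidates
        if any(regex.search(line) for line in added_lines(c.diff))
    ]


# Made-up candidate commits, for illustration only.
candidates = [
    Commit(
        "a1b2c3",
        "--- a/drv.c\n+++ b/drv.c\n@@ -1,2 +1,3 @@\n"
        "+\tbuf = kmalloc(len, GFP_KERNEL);\n"
        "+\tmemcpy(buf, src, len);\n",
    ),
    Commit(
        "d4e5f6",
        "--- a/drv.c\n+++ b/drv.c\n@@ -9,1 +9,2 @@\n"
        "+\tpr_info(\"probe done\\n\");\n",
    ),
]

# Assumed pattern: stands in for what an agent might distill from a fix
# commit that adds a missing NULL check after this allocation.
pattern = r"kmalloc\(len, GFP_KERNEL\)"
print(search_candidates(pattern, candidates))  # prints ['a1b2c3']
```

Matching only added lines is a design choice of this sketch: a bug-introducing commit is one that put the faulty code into the tree, so lines it removed or left as context are ignored.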

Problem

Research questions and friction points this paper is trying to address.

bug-introducing commits

fix commits

software repositories

defect identification

commit analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agents

bug-introducing commits

agentic workflow