One Bug, Hundreds Behind: LLMs for Large-Scale Bug Discovery

📅 2025-10-15

🤖 AI Summary

Recurring Pattern Bugs (RPBs) are security vulnerabilities that appear repeatedly across code segments or programs because they share a root cause, yet remain unpatched after one instance is fixed. They pose a significant and underdetected threat in large-scale software. Method: This paper introduces BugStone, a program analysis system that combines LLVM-based analysis with a large language model (LLM). BugStone derives an error pattern, such as a specific API misuse, from a patched instance and then examines the entire program for code that matches the pattern. Contribution/Results: Bug reports can hand attackers an exploitable pattern that is still unresolved elsewhere in the program, and BugStone targets that gap. Starting from 135 unique RPBs, it flagged more than 22K new potential issues in the Linux kernel; manual analysis of 400 of these findings confirmed 246 as valid. On a manually annotated dataset built from over 1.9K reported security bugs, BugStone achieved 92.2% precision and 79.1% pairwise accuracy.
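
For reference, the share of manually reviewed findings that were confirmed can be worked out directly from the two counts reported above. The short sketch below does only that arithmetic; the function name `validation_rate` is an illustrative choice, not something from the paper, and the result describes the 400 reviewed findings only, not all reported issues.

```python
def validation_rate(confirmed: int, reviewed: int) -> float:
    """Fraction of manually reviewed findings that were confirmed as valid."""
    if reviewed <= 0:
        raise ValueError("reviewed must be positive")
    return confirmed / reviewed


if __name__ == "__main__":
    # Counts taken from the summary: 246 of 400 reviewed findings were valid.
    rate = validation_rate(246, 400)
    print(f"{rate:.1%}")  # prints 61.5%
```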

📝 Abstract

Fixing bugs in large programs is a challenging task that demands substantial time and effort. Once a bug is found, it is reported to the project maintainers, who work with the reporter to fix it and eventually close the issue. However, the program often contains similar code segments that may have the same bug but were missed during discovery. Finding and fixing each recurring bug instance individually is labor-intensive. Even more concerning, bug reports can inadvertently widen the attack surface, as they provide attackers with an exploitable pattern that may be unresolved in other parts of the program. In this paper, we explore these Recurring Pattern Bugs (RPBs), which appear repeatedly across various code segments of a program or even in different programs, stem from the same root cause, and remain unresolved. Our investigation reveals that RPBs are widespread and can significantly compromise the security of software programs. This paper introduces BugStone, a program analysis system empowered by LLVM and a Large Language Model (LLM). The key observation is that many RPBs have one patched instance, which can be leveraged to identify a consistent error pattern, such as a specific API misuse. By examining the entire program for this pattern, it is possible to identify similar sections of code that may be vulnerable. Starting with 135 unique RPBs, BugStone identified more than 22K new potential issues in the Linux kernel. Manual analysis of 400 of these findings confirmed that 246 were valid. We also created a dataset from over 1.9K security bugs reported by 23 recent top-tier conference works. We manually annotated the dataset, identifying 80 recurring patterns and 850 corresponding fixes. Even with a cost-efficient model choice, BugStone achieved 92.2% precision and 79.1% pairwise accuracy on the dataset.
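
To make the "one patched instance, one pattern, whole-program scan" idea concrete, here is a deliberately simplified sketch. It is not BugStone's implementation: the abstract describes LLVM-based analysis plus an LLM, whereas this toy uses a regular expression over C source text. The pattern ("the result of `kmalloc()` must be NULL-checked before use"), the corpus, and every name in the code (`Finding`, `find_unchecked_calls`, `scan_corpus`, `patched_probe`, `sibling_probe`) are assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    function: str  # name of the toy function the call appears in
    line_no: int   # 1-based line number within that function's source
    text: str      # the flagged source line


# Hypothetical error pattern, as if distilled from one patched bug:
# "the pointer returned by kmalloc() must be NULL-checked before use".
API_NAME = "kmalloc"
ASSIGN_RE = re.compile(r"(\w+)\s*=\s*" + re.escape(API_NAME) + r"\s*\(")


def find_unchecked_calls(function_name, source, window=3):
    """Flag API_NAME calls whose result is not NULL-checked in the next `window` lines."""
    lines = source.splitlines()
    findings = []
    for i, line in enumerate(lines):
        match = ASSIGN_RE.search(line)
        if match is None:
            continue
        var = re.escape(match.group(1))
        # Accept either `if (!var)` or `if (var == NULL)` as a check.
        check_re = re.compile(r"if\s*\(\s*(!\s*" + var + r"\b|" + var + r"\s*==\s*NULL)")
        following = lines[i + 1 : i + 1 + window]
        if not any(check_re.search(nxt) for nxt in following):
            findings.append(Finding(function_name, i + 1, line.strip()))
    return findings


def scan_corpus(corpus):
    """Apply the single pattern to every function in the (toy) program."""
    results = []
    for name, source in corpus.items():
        results.extend(find_unchecked_calls(name, source))
    return results


# Toy corpus: one function that already carries the fix, one sibling that does not.
CORPUS = {
    "patched_probe": (
        "buf = kmalloc(len, GFP_KERNEL);\n"
        "if (!buf)\n"
        "    return -ENOMEM;\n"
        "memcpy(buf, src, len);\n"
    ),
    "sibling_probe": (
        "ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);\n"
        "ctx->id = id;\n"
    ),
}

if __name__ == "__main__":
    for finding in scan_corpus(CORPUS):
        print(f"{finding.function}:{finding.line_no}: {finding.text}")
```

A text-level match like this breaks on aliasing, macros, wrapper functions, and checks placed further away, which is presumably why the paper relies on compiler-level analysis and an LLM rather than pattern strings. The sketch only shows the overall shape of the workflow: derive a pattern from one fix, then sweep the rest of the code for it.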

Problem

Research questions and friction points this paper is trying to address.

Identifying recurring pattern bugs across large codebases efficiently

Detecting similar vulnerable code segments using patched instances as patterns

Automating large-scale bug discovery to prevent security vulnerabilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages an LLM to identify recurring bug patterns

Uses patched instances to detect similar vulnerabilities

Analyzes entire programs for consistent error patterns

🔎 Similar Papers

A Systematic Literature Review on Large Language Models for Automated Program Repair