MAS-SZZ: Multi-Agentic SZZ Algorithm for Vulnerability-Inducing Commit Identification

📅 2026-04-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing SZZ algorithms struggle to identify vulnerability-inducing commits accurately because they select the wrong anchors and have limited backtracking capability. This work proposes MAS-SZZ, a multi-agent SZZ framework that takes a CVE description and its fixing commit, summarizes the root cause of the vulnerability, and uses a structured step-forward prompting strategy to localize vulnerability-related statements based on the change intent of each patch hunk. These statements serve as anchors from which the framework automatically traces back through the version history to find the commit that first introduced the vulnerability. The method outperforms state-of-the-art baselines across multiple datasets and programming languages, with F1-score gains of up to 65.22% over the best-performing SZZ algorithm.

📝 Abstract

Accurate identification of vulnerability-inducing commits is the foundation for a range of software security tasks, such as vulnerability detection and affected-version analysis. A straightforward solution is the SZZ algorithm, which traces back through the code history to identify the earliest commit that modifies the vulnerable code. Unfortunately, neither the customized V-SZZ nor the state-of-the-art LLM4SZZ performs satisfactorily, owing to incorrect anchor selection and inadequate backtracking capability, which leaves both far from reliable in practice. To overcome these challenges, we propose a multi-agentic SZZ algorithm, named MAS-SZZ, that identifies vulnerability-inducing commits through collaboration among agents. Specifically, given a CVE description and its corresponding fixing commit, MAS-SZZ summarizes the root cause of the vulnerability and employs a structured step-forward prompting strategy to localize vulnerability-related statements based on the change intent of each patch hunk. These vulnerable statements serve as anchors from which MAS-SZZ autonomously traces backward through the repository's history to find the commit that first introduced the vulnerability. Extensive experiments show that MAS-SZZ outperforms the state-of-the-art baselines across datasets and programming languages, achieving F1-score gains of up to 65.22% over the best-performing SZZ algorithm.
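To make the roles of anchor selection and backtracking concrete, here is a minimal sketch of SZZ-style tracing over a toy in-memory history. It is not the MAS-SZZ implementation: the `Commit` type, the `HISTORY` data, and the `is_vulnerable` hook are hypothetical, the hook merely stands in for whatever selects anchors (in MAS-SZZ, the agents), and exact string matching on whole-file snapshots is a simplifying assumption in place of real blame over diffs.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Commit:
    """Hypothetical toy commit: an id plus the full file content after it."""
    sha: str
    lines: tuple[str, ...]


def deleted_lines(parent: Commit, fix: Commit) -> list[str]:
    """Anchor candidates: lines present before the fix but absent after it."""
    after = set(fix.lines)
    return [line for line in parent.lines if line not in after]


def introducing_commit(history: list[Commit], fix_index: int, anchor: str) -> str | None:
    """Walk backward from the fix's parent while the anchor is still present.

    Returns the oldest commit in that unbroken run, or None if the anchor
    is not in the parent at all.
    """
    found = None
    for commit in reversed(history[:fix_index]):
        if anchor not in commit.lines:
            break
        found = commit.sha
    return found


def szz(
    history: list[Commit],
    fix_index: int,
    is_vulnerable: Callable[[str], bool] = lambda line: True,
) -> dict[str, str | None]:
    """Map each selected anchor line to the commit that introduced it.

    `is_vulnerable` is an assumed hook for anchor selection; the default
    keeps every deleted line, as a plain SZZ variant would.
    """
    parent, fix = history[fix_index - 1], history[fix_index]
    anchors = [line for line in deleted_lines(parent, fix) if is_vulnerable(line)]
    return {a: introducing_commit(history, fix_index, a) for a in anchors}


# Toy history: the fix adds a bounds check and also renames a logging call.
HISTORY = [
    Commit("c1", ("int n = read();",)),
    Commit("c2", ("int n = read();", "buf[n] = 0;")),
    Commit("c3", ("int n = read();", "buf[n] = 0;", "log(n);")),
    Commit("fix", ("int n = read();", "if (n < LEN) buf[n] = 0;", "log_debug(n);")),
]

# Every deleted line is an anchor, so the unrelated logging change blames c3 too.
print(szz(HISTORY, 3))
# Only the out-of-bounds write is kept as an anchor, so only c2 is blamed.
print(szz(HISTORY, 3, is_vulnerable=lambda line: "buf[" in line))
```

With the default hook, the unrelated logging change is treated as an anchor and points at `c3`, which illustrates how poor anchor selection produces a wrong answer even when the backtracking itself is correct.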

Problem

Research questions and friction points this paper is trying to address.

vulnerability-inducing commit

SZZ algorithm

anchor selection

backtracking capability

software security

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agentic

SZZ algorithm

vulnerability-inducing commit