A Systematic Literature Review on Detecting Software Vulnerabilities with Large Language Models

📅 2025-07-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current LLM-based software vulnerability detection research is fragmented and lacks comparability, hindering progress assessment and standardization. Method: We conduct a systematic literature review (SLR) of 227 studies published between January 2020 and June 2025, performing multi-dimensional coding across task definitions, input representations, architectural designs, adaptation techniques, and dataset characteristics. Contribution/Results: We propose the first fine-grained taxonomy for LLM-based vulnerability detection; systematically uncover critical limitations, including dataset-level vulnerability coverage bias, model adaptation bottlenecks, and inconsistent evaluation protocols; and release an open, continuously updated knowledge base containing all coded data and analysis frameworks. This resource establishes reproducible benchmarks and practical guidelines, enhancing transparency, rigor, and cross-study comparability in the field.

📝 Abstract
The increasing adoption of Large Language Models (LLMs) in software engineering has sparked interest in their use for software vulnerability detection. However, the rapid development of this field has resulted in a fragmented research landscape, with diverse studies that are difficult to compare due to differences in system design, dataset usage, and other factors. This fragmentation makes it difficult to obtain a clear overview of the state of the art or to compare and categorize studies meaningfully. In this work, we present a comprehensive systematic literature review (SLR) of LLM-based software vulnerability detection. We analyze 227 studies published between January 2020 and June 2025, categorizing them by task formulation, input representation, system architecture, and adaptation techniques. Further, we analyze the datasets used, including their characteristics, vulnerability coverage, and diversity. We present a fine-grained taxonomy of vulnerability detection approaches, identify key limitations, and outline actionable future research opportunities. By providing a structured overview of the field, this review improves transparency and serves as a practical guide for researchers and practitioners aiming to conduct more comparable and reproducible research. We publicly release all artifacts and maintain a living repository of LLM-based software vulnerability detection studies.
Problem

Research questions and friction points this paper is trying to address.

Fragmented research landscape of LLM-based vulnerability detection methods
Lack of standardized comparison due to diverse study designs and dataset usage
Need for a structured taxonomy and reproducible research guidelines
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic literature review of 227 studies on LLM-based vulnerability detection
Fine-grained taxonomy of approaches by task formulation, input representation, architecture, and adaptation technique
Public, continuously updated repository supporting reproducible research