A Preliminary Study of Large Language Models for Multilingual Vulnerability Detection

📅 2025-05-12
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the language-specific limitations of existing deep learning-based vulnerability detection methods, which hinder cross-language generalization. We conduct a preliminary systematic evaluation of pretrained language models (PLMs) and large language models (LLMs) for multilingual vulnerability detection across seven popular programming languages. Using fine-tuning and zero-shot prompting strategies on a unified multilingual code dataset, we comparatively assess CodeT5P, LLaMA, Qwen, and other representative models. Results show that CodeT5P achieves the best performance among the evaluated models, both in overall detection accuracy and in identifying the most critical vulnerabilities, and its cross-language transfer capability supports the value of universal code representations for vulnerability detection. This study provides empirical evidence and a concrete technical pathway toward industrial-grade, multilingual automated security analysis tools.
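To make the zero-shot prompting strategy concrete, here is a minimal sketch of prompting an instruction-tuned LLM for a binary verdict on a single function. The checkpoint name, prompt wording, and decoding settings are illustrative assumptions, not the paper's exact configuration.

# Hedged sketch: zero-shot prompting an LLM for vulnerability detection.
# The checkpoint and prompt below are assumptions; the paper evaluates
# LLaMA, Qwen, and other models whose exact setup is not reproduced here.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2.5-Coder-7B-Instruct")

def detect_vulnerability(code: str, language: str) -> str:
    """Ask the model for a binary VULNERABLE / SAFE verdict on one function."""
    prompt = (
        f"You are a security analyst. Review the following {language} function "
        f"and answer with exactly 'VULNERABLE' or 'SAFE'.\n\n{code}\n\nAnswer: "
    )
    out = generator(prompt, max_new_tokens=5, do_sample=False)
    # The text-generation pipeline returns the prompt plus the continuation;
    # keep only the model's verdict.
    return out[0]["generated_text"][len(prompt):].strip()

# Example: an unbounded copy into a fixed-size buffer in C.
print(detect_vulnerability("void f(char *s) { char buf[8]; strcpy(buf, s); }", "C"))

The same wrapper can be reused across all seven languages by swapping the language tag, which is what makes prompting attractive for multilingual settings.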

📝 Abstract
Deep learning-based approaches, particularly those leveraging pre-trained language models (PLMs), have shown promise in automated software vulnerability detection. However, existing methods are predominantly limited to specific programming languages, restricting their applicability in multilingual settings. Recent advancements in large language models (LLMs) offer language-agnostic capabilities and enhanced semantic understanding, presenting a potential solution to this limitation. While existing studies have explored LLMs for vulnerability detection, their detection performance remains unknown for multilingual vulnerabilities. To address this gap, we conducted a preliminary study to evaluate the effectiveness of PLMs and state-of-the-art LLMs across seven popular programming languages. Our findings reveal that the PLM CodeT5P achieves the best performance in multilingual vulnerability detection, particularly in identifying the most critical vulnerabilities. Based on these results, we further discuss the potential of LLMs in advancing real-world multilingual vulnerability detection. This work represents an initial step toward exploring PLMs and LLMs for cross-language vulnerability detection, offering key insights for future research and practical deployment.
Problem

Research questions and friction points this paper is trying to address.

Evaluating how effectively PLMs and state-of-the-art LLMs detect vulnerabilities across multiple programming languages
Assessing the cross-language generalization gap of existing language-specific vulnerability detection methods
Determining which models best identify the most critical vulnerabilities in multilingual code
Innovation

Methods, ideas, or system contributions that make the work stand out.

Applying both PLMs and LLMs to multilingual vulnerability detection with fine-tuning and zero-shot prompting strategies
Evaluating representative models across seven popular programming languages on a unified multilingual dataset
Showing that CodeT5P achieves the best performance, particularly on the most critical vulnerabilities (fine-tuning sketched below)
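For the fine-tuning strategy, a minimal sketch follows, assuming the publicly available Salesforce/codet5p-220m checkpoint with its encoder mean-pooled into a binary vulnerable/safe classification head; the paper's exact checkpoint, pooling strategy, and hyperparameters are not specified here.

# Hedged sketch: fine-tuning a CodeT5+ encoder as a binary vulnerability
# classifier. Checkpoint, pooling, and hyperparameters are assumptions.
import torch
from torch import nn
from transformers import AutoTokenizer, T5EncoderModel

CHECKPOINT = "Salesforce/codet5p-220m"  # assumed model size

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
encoder = T5EncoderModel.from_pretrained(CHECKPOINT)  # encoder-only view of CodeT5+

class VulnClassifier(nn.Module):
    """Mean-pool encoder states over non-padding tokens, then classify."""
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(encoder.config.d_model, 2)  # 0 = safe, 1 = vulnerable

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (states * mask).sum(dim=1) / mask.sum(dim=1)
        return self.head(pooled)

model = VulnClassifier(encoder)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed learning rate

# One illustrative training step on a toy (code, label) pair.
batch = tokenizer(["void f(char *s) { char buf[8]; strcpy(buf, s); }"],
                  return_tensors="pt", padding=True, truncation=True, max_length=512)
labels = torch.tensor([1])  # 1 = vulnerable
optimizer.zero_grad()
logits = model(batch["input_ids"], batch["attention_mask"])
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()
optimizer.step()

Because the classification head sits on top of a code-pretrained encoder rather than language-specific features, the same model can be trained on a mixed-language dataset, which is the property the cross-language results rely on.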
Authors
Junji Yu
College of Intelligence and Computing, Tianjin University
Honglin Shu
Kyushu University
Michael Fu
The University of Melbourne
Dong Wang
College of Intelligence and Computing, Tianjin University
Chakkrit Tantithamthavorn
Information Technology, Monash University
Yasutaka Kamei
Professor, Kyushu University, InaRIS Fellow
Junjie Chen
College of Intelligence and Computing, Tianjin University