LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

๐Ÿ“… 2026-04-05
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This study examines whether traditional vulnerability disclosure frameworks adequately capture the novel, model-mediated security risks of systems that integrate large language models (LLMs). Through an empirical analysis of 295 GitHub security advisories involving LLM components, published between January 2025 and January 2026, and manual annotation of 100 of them using both CWE and the OWASP Top 10 for LLM Applications 2025, it uncovers co-occurrence patterns among architectural risks such as prompt injection, excessive agency, and supply chain contamination. The findings show that while most vulnerabilities map to existing CWE categories (e.g., injection, deserialization), their underlying model-mediated mechanisms are frequently missed by current disclosure practices, supporting the case for combining the CWE and OWASP LLM Top 10 perspectives.
๐Ÿ“ Abstract
Large language models (LLMs) are increasingly embedded in open-source software (OSS) ecosystems, creating complex interactions among natural language prompts, probabilistic model outputs, and execution-capable components. However, it remains unclear whether traditional vulnerability disclosure frameworks adequately capture these model-mediated risks. To investigate this, we analyze 295 GitHub Security Advisories published between January 2025 and January 2026 that reference LLM-related components, and we manually annotate a sample of 100 advisories using the OWASP Top 10 for LLM Applications 2025. We find no evidence of new implementation-level weakness classes specific to LLM systems. Most advisories map to established CWEs, particularly injection and deserialization weaknesses. At the same time, the OWASP-based analysis reveals recurring architectural risk patterns, especially Supply Chain, Excessive Agency, and Prompt Injection, which often co-occur across multiple stages of execution. These results suggest that existing advisory metadata captures code-level defects but underrepresents model-mediated exposure. We conclude that combining the CWE and OWASP perspectives provides a more complete and necessary view of vulnerabilities in LLM-integrated systems.
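The annotation method described above pairs each advisory with a CWE label and one or more OWASP LLM Top 10 categories, then looks for categories that co-occur within the same advisory. A minimal sketch of that tallying step, using entirely hypothetical advisory IDs and labels (not the paper's actual dataset):

```python
from collections import Counter
from itertools import combinations

# Hypothetical dual-annotated advisories: each carries one CWE ID and one or
# more OWASP LLM Top 10 (2025) categories. Labels are illustrative only.
advisories = [
    {"id": "GHSA-xxxx-0001", "cwe": "CWE-94",
     "owasp": ["LLM01 Prompt Injection", "LLM06 Excessive Agency"]},
    {"id": "GHSA-xxxx-0002", "cwe": "CWE-502",
     "owasp": ["LLM03 Supply Chain"]},
    {"id": "GHSA-xxxx-0003", "cwe": "CWE-77",
     "owasp": ["LLM01 Prompt Injection", "LLM06 Excessive Agency"]},
]

def owasp_cooccurrence(items):
    """Count how often each pair of OWASP categories is assigned to the
    same advisory; pairs are sorted so (A, B) and (B, A) merge."""
    pairs = Counter()
    for adv in items:
        for a, b in combinations(sorted(adv["owasp"]), 2):
            pairs[(a, b)] += 1
    return pairs

pairs = owasp_cooccurrence(advisories)
# In this toy sample, Prompt Injection and Excessive Agency co-occur twice,
# mirroring the kind of recurring pattern the study reports at scale.
```

The same structure extends to CWE-vs-OWASP cross-tabulation by pairing `adv["cwe"]` with each OWASP label instead of pairing OWASP labels with each other.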
Problem

Research questions and friction points this paper is trying to address.

LLM vulnerabilities
GitHub Security Advisories
OWASP Top 10 for LLM
model-mediated risks
open-source software security
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM security
GitHub Security Advisories
OWASP Top 10 for LLM
Prompt Injection
Software Supply Chain
๐Ÿ”Ž Similar Papers
No similar papers found.
F
Fariha Tanjim Shifat
Missouri University of Science and Technology
Hariswar Baburaj
Missouri University of Science and Technology
Ce Zhou
Missouri University of Science and Technology
Jaydeb Sarker
Assistant Professor, University of Nebraska at Omaha
Software Engineering, Human Aspects of Software Engineering, Natural Language
Mia Mohammad Imran
Missouri University of Science and Technology
NLP, Machine Learning, Mining Software Repositories, Empirical Software Engineering