LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures

📅 2025-05-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) face pervasive security risks across their full lifecycle, yet existing work lacks a systematic taxonomy distinguishing training-phase threats (e.g., data poisoning, backdoor injection) from deployment-phase threats (e.g., prompt injection, jailbreaking). Method: We propose the first “training-phase/deployment-phase” dichotomous attack framework, grounded in rigorous security threat modeling and empirical attack pattern analysis. We design a dual-track defense architecture—comprising preventive and detection-oriented mechanisms—and establish a defense strategy taxonomy with formalized effectiveness evaluation criteria and threat-specific defense boundaries. Contribution/Results: Our work yields a structured LLM security knowledge graph identifying six core vulnerability classes and their corresponding mitigation pathways. It further pinpoints five critical open challenges in LLM defense research. Collectively, this provides both a theoretical foundation and actionable guidance for LLM security governance.

📝 Abstract

As large language models (LLMs) continue to evolve, it is critical to assess the security threats and vulnerabilities that may arise both during their training phase and after models have been deployed. This survey seeks to define and categorize the various attacks targeting LLMs, distinguishing between those that occur during the training phase and those that affect already trained models. A thorough analysis of these attacks is presented, alongside an exploration of defense mechanisms designed to mitigate such threats. Defenses are classified into two primary categories: prevention-based and detection-based defenses. Furthermore, our survey summarizes possible attacks and their corresponding defense strategies. It also provides an evaluation of the effectiveness of the known defense mechanisms for the different security threats. Our survey aims to offer a structured framework for securing LLMs, while also identifying areas that require further research to improve and strengthen defenses against emerging security challenges.

Problem

Research questions and friction points this paper is trying to address.

Identify security threats in LLM training and deployment phases

Categorize attacks on LLMs and analyze defense mechanisms

Evaluate effectiveness of defenses against emerging LLM vulnerabilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

Categorizes attacks on LLMs by training and deployment phases

Classifies defenses into prevention-based and detection-based strategies

Evaluates effectiveness of known defense mechanisms for LLM security
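
As a rough illustration of the two classification axes listed above, the sketch below tags attacks by lifecycle phase and defenses by kind. It is a minimal sketch, not the survey's own data model: the names `Phase`, `DefenseKind`, `Attack`, `ATTACKS`, and `attacks_in` are hypothetical, and the four example attacks are only those named in the AI Summary. No attack-to-defense mapping is shown, because this page does not give one.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """Lifecycle phase in which an attack occurs (hypothetical naming)."""
    TRAINING = "training"
    DEPLOYMENT = "deployment"


class DefenseKind(Enum):
    """The two defense categories described in the abstract."""
    PREVENTION = "prevention"
    DETECTION = "detection"


@dataclass(frozen=True)
class Attack:
    name: str
    phase: Phase


# Example attacks named in the AI Summary, tagged by phase.
# This list is illustrative, not the survey's full catalogue.
ATTACKS = [
    Attack("data poisoning", Phase.TRAINING),
    Attack("backdoor injection", Phase.TRAINING),
    Attack("prompt injection", Phase.DEPLOYMENT),
    Attack("jailbreaking", Phase.DEPLOYMENT),
]


def attacks_in(phase: Phase) -> list[str]:
    """Return the names of the example attacks that belong to one phase."""
    return [a.name for a in ATTACKS if a.phase is phase]


if __name__ == "__main__":
    for phase in Phase:
        print(phase.value, "->", attacks_in(phase))
```

Keeping the phase as an explicit field, rather than as two separate lists, makes it straightforward to add further axes later, such as which `DefenseKind` applies to a given threat.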

🔎 Similar Papers

Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices