🤖 AI Summary
This study addresses the lack of empirical evaluation of large language models (LLMs) for STRIDE threat modeling, particularly structured threat classification in 5G network security. Method: We systematically assess five state-of-the-art LLMs—including GPT-4, Claude, and Llama—using a prompt engineering framework that integrates few-shot learning, chain-of-thought reasoning, role-based prompting, and strict output formatting constraints. Contribution/Results: Results reveal substantial performance disparities across STRIDE categories, with accuracy ranging from 41% to 89% (e.g., for Spoofing and Tampering), exposing inherent model biases and domain knowledge gaps. Performance is shown to depend jointly on the semantic complexity of the threat and the distribution of the training data. We propose a cybersecurity-oriented LLM optimization framework, empirically validating that domain-adapted prompting and lightweight fine-tuning significantly improve classification robustness. This work establishes a methodological foundation and a practical pathway for deploying LLMs in automated, scalable threat modeling.
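To make the framework concrete, the following is a minimal, illustrative sketch (not the paper's actual prompts; category lists, example threats, and the output-line convention are assumptions) of how the four techniques named above can be combined into a single STRIDE classification prompt, together with a parser for the constrained output:

```python
# Illustrative sketch of the four prompting techniques named in the summary:
# role-based prompting, few-shot learning, chain-of-thought reasoning, and
# strict output formatting, applied to STRIDE classification of a 5G threat.
# The examples and the "CATEGORY:" convention are hypothetical.

STRIDE = ["Spoofing", "Tampering", "Repudiation",
          "Information Disclosure", "Denial of Service",
          "Elevation of Privilege"]

# Hypothetical few-shot examples (threat description, gold label).
FEW_SHOT = [
    ("An attacker impersonates a legitimate gNB to lure UEs.", "Spoofing"),
    ("Signaling messages are modified in transit on the N2 interface.",
     "Tampering"),
]

def build_prompt(threat: str) -> str:
    lines = [
        # Role-based prompting: fix the model's persona.
        "You are a 5G network security analyst performing STRIDE "
        "threat modeling.",
        "Classify the threat into exactly one category: "
        + ", ".join(STRIDE) + ".",
        # Chain-of-thought: ask for reasoning before the verdict.
        "Think step by step about the violated security property, "
        "then answer.",
        # Strict output formatting: constrain the final line for parsing.
        "End your reply with a line of the form 'CATEGORY: <name>'.",
        "",
    ]
    for example, label in FEW_SHOT:  # few-shot learning
        lines += [f"Threat: {example}", f"CATEGORY: {label}", ""]
    lines.append(f"Threat: {threat}")
    return "\n".join(lines)

def parse_category(reply: str) -> str:
    """Extract the label from the constrained final line of a reply."""
    for line in reversed(reply.strip().splitlines()):
        if line.startswith("CATEGORY:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no CATEGORY line found")
```

The strict output line is what makes large-scale evaluation tractable: the parser needs only the final `CATEGORY:` line, so free-form chain-of-thought text before it does not break scoring.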
📝 Abstract
Artificial Intelligence (AI) is expected to be an integral part of next-generation AI-native 6G networks. With the growing prevalence of AI, researchers have identified numerous use cases for AI in network security. However, studies analyzing the suitability of Large Language Models (LLMs) for network security are almost nonexistent. To fill this gap, we examine the suitability of LLMs for network security through a case study of STRIDE threat modeling. We apply four prompting techniques to five LLMs to perform STRIDE classification of 5G threats. From our evaluation results, we highlight key findings and detailed insights, along with explanations of the factors that likely influence LLM behavior when modeling certain threats. The numerical results and insights underscore the necessity of adapting and fine-tuning LLMs for network security use cases.