🤖 AI Summary
This paper addresses the security risks arising from the "dual-use" nature of large language models (LLMs) in cybersecurity red-team and blue-team operations. It systematically identifies reliability deficiencies stemming from LLMs' intrinsic limitations, such as hallucination, limited context retention, and weak reasoning, which are amplified by integration-level threats including dual-use misuse, adversarial prompting, and insufficient human oversight. To address these issues, the paper introduces a "capability–risk" co-analysis framework that combines MITRE ATT&CK and NIST Cybersecurity Framework (CSF) mappings, adversarial prompt testing, threat modeling, and human-factor analysis to assess LLM performance boundaries across tasks such as automated reconnaissance, phishing email generation, and log analysis. The study yields four actionable security-by-design principles: human-in-the-loop decision-making, enhanced model interpretability, privacy-preserving architecture, and robustness to adversarial exploitation, providing both theoretical grounding and practical guidance for securely deploying LLMs in adversarial cybersecurity contexts.
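To make the capability–risk co-analysis concrete, the sketch below shows one illustrative way such a mapping could be encoded: each LLM-assisted task is tied to a MITRE ATT&CK technique (its offensive use) and a NIST CSF function (its defensive use), then screened for the dangerous combination of high capability and high misuse risk. The task entries and all numeric scores are assumptions for demonstration, not results from the paper.

```python
from dataclasses import dataclass

@dataclass
class LLMTaskProfile:
    """Illustrative capability-risk record for one LLM-assisted task."""
    task: str
    attack_technique: str   # MITRE ATT&CK technique ID (offensive use)
    csf_function: str       # NIST CSF function touched by defensive use
    capability: float       # assumed 0-1 score: how well LLMs perform the task
    misuse_risk: float      # assumed 0-1 score: severity of dual-use misuse

# Hypothetical entries; real scores would come from the paper's evaluations.
PROFILES = [
    LLMTaskProfile("automated reconnaissance", "T1595", "Identify", 0.7, 0.8),
    LLMTaskProfile("phishing email generation", "T1566", "Protect", 0.9, 0.9),
    LLMTaskProfile("log analysis / triage", "N/A", "Detect", 0.6, 0.3),
]

def flag_high_risk(profiles, cap_min=0.6, risk_min=0.7):
    """Flag tasks where strong capability coincides with high misuse risk,
    i.e., where human oversight and auditing matter most."""
    return [p for p in profiles
            if p.capability >= cap_min and p.misuse_risk >= risk_min]

for p in flag_high_risk(PROFILES):
    print(f"{p.task}: ATT&CK {p.attack_technique}, CSF {p.csf_function} "
          f"(capability={p.capability}, misuse_risk={p.misuse_risk})")
```

Screening for tasks where capability and misuse risk are both high is one plausible way to prioritize where the paper's human-in-the-loop and auditing principles should be enforced first.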
📝 Abstract
Large Language Models (LLMs) are set to reshape cybersecurity by augmenting red and blue team operations. Red teams can leverage LLMs to plan attacks, craft phishing content, simulate adversaries, and generate exploit code. Conversely, blue teams may deploy them for threat intelligence synthesis, root cause analysis, and streamlined documentation. This dual capability introduces both transformative potential and serious risks. This position paper maps LLM applications across cybersecurity frameworks such as MITRE ATT&CK and the NIST Cybersecurity Framework (CSF), offering a structured view of their current utility and limitations. While LLMs demonstrate fluency and versatility across various tasks, they remain fragile in high-stakes, context-heavy environments. Key limitations include hallucinations, limited context retention, poor reasoning, and sensitivity to prompts, all of which undermine their reliability in operational settings. Moreover, real-world integration raises concerns around dual-use risks, adversarial misuse, and diminished human oversight. Malicious actors could exploit LLMs to automate reconnaissance, obscure attack vectors, and lower the technical threshold for executing sophisticated attacks. To ensure safer adoption, we recommend maintaining human-in-the-loop oversight, enhancing model explainability, integrating privacy-preserving mechanisms, and building systems robust to adversarial exploitation. As organizations increasingly adopt AI-driven cybersecurity, a nuanced understanding of LLMs' risks and operational impacts is critical to harnessing their defensive value while mitigating unintended consequences.
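As one way to operationalize the human-in-the-loop recommendation, the sketch below gates LLM-suggested response actions behind an analyst checkpoint. This is a minimal sketch under stated assumptions: the `suggest_action`, `ask_analyst`, and `run` hooks are hypothetical callables, not part of any specific SOC platform or the paper's implementation.

```python
# Minimal human-in-the-loop gate for LLM-suggested response actions.
# All hooks are hypothetical; real deployments would wire these to an
# LLM wrapper, an analyst UI, and a response orchestrator respectively.

REVERSIBLE_ACTIONS = {"open_ticket", "enrich_ioc"}   # low-impact, auto-approvable
DESTRUCTIVE_ACTIONS = {"isolate_host", "block_ip", "disable_account"}

def execute_with_oversight(alert: dict, suggest_action, ask_analyst, run):
    """Route an LLM-suggested action through a human checkpoint.

    suggest_action(alert) -> (action_name: str, rationale: str)
    ask_analyst(prompt: str) -> bool   # True if the analyst approves
    run(action_name: str, alert: dict)  # executes the approved action
    """
    action, rationale = suggest_action(alert)

    if action in REVERSIBLE_ACTIONS:
        run(action, alert)  # low-impact: safe to automate
    elif action in DESTRUCTIVE_ACTIONS:
        # Surface the model's rationale so the analyst can audit it;
        # hallucinated justifications are a known failure mode.
        prompt = (f"LLM proposes '{action}' for alert {alert.get('id')}.\n"
                  f"Rationale: {rationale}\nApprove?")
        if ask_analyst(prompt):
            run(action, alert)
    else:
        # Unknown action: fail closed rather than trusting the model.
        raise ValueError(f"Unrecognized action from model: {action!r}")
```

Failing closed on unrecognized actions and surfacing the model's rationale for audit are small design choices that speak directly to the abstract's concerns about hallucination and diminished human oversight.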