Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation

📅 2025-10-30
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Large language models (LLMs) exhibit limited adaptability to emerging cyber threats, suffer from opaque reasoning processes, and lack verifiable trustworthiness in dynamic security environments. To address these challenges, this paper proposes a retrieval-augmented generation (RAG) framework tailored for dynamic cybersecurity scenarios. The core contribution is an optimized hybrid retrieval mechanism that integrates real-time, multi-source threat intelligence (including CVE records, MITRE ATT&CK, and operational threat feeds) to enhance LLMs' temporal reasoning and contextual understanding of novel attack patterns. The authors employ Llama-3-8B-Instruct as the foundation model, enabling dynamic knowledge updates without retraining and generating interpretable, traceable responses. Experimental evaluation on threat detection tasks demonstrates significant improvements over baselines, including a +12.7% accuracy gain and enhanced output consistency. The framework substantially improves model adaptability to rapidly evolving threat landscapes while strengthening reliability and explainability in security-critical applications.
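The summary does not spell out how the optimized hybrid retrieval mechanism fuses its signals. A common pattern for combining a lexical (BM25-style) ranking with a dense-embedding ranking is reciprocal rank fusion (RRF), sketched below. This is illustrative only: the toy `lexical_rank` stands in for a real BM25 index, and the dense ranking `sem` is hard-coded as an assumption rather than produced by an embedding model.

```python
from collections import Counter

def lexical_rank(query, docs):
    """Rank documents by simple term overlap (a stand-in for BM25)."""
    q_terms = set(query.lower().split())
    return sorted(range(len(docs)),
                  key=lambda i: -len(q_terms & set(docs[i].lower().split())))

def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return [doc_id for doc_id, _ in scores.most_common()]

# Toy threat-intelligence corpus mixing CVE and ATT&CK entries.
docs = [
    "CVE-2024-3094 xz backdoor supply chain compromise",
    "MITRE ATT&CK T1059 command and scripting interpreter",
    "phishing campaign targeting the financial sector",
]
lex = lexical_rank("xz supply chain CVE", docs)
sem = [0, 2, 1]  # hypothetical ranking from a dense retriever
fused = rrf_fuse([lex, sem])
```

RRF is attractive for this setting because it fuses rankings rather than raw scores, so the lexical and dense retrievers do not need comparable score scales.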

📝 Abstract
Security applications are increasingly relying on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge. Because security threats evolve rapidly, LLMs must not only recall historical incidents but also adapt to emerging vulnerabilities and attack patterns. Retrieval-Augmented Generation (RAG) has demonstrated effectiveness in general LLM applications, but its potential for cybersecurity remains underexplored. In this work, we introduce a RAG-based framework designed to contextualize cybersecurity data and enhance LLM accuracy in knowledge retention and temporal reasoning. Using external datasets and the Llama-3-8B-Instruct model, we evaluate baseline RAG, an optimized hybrid retrieval approach, and conduct a comparative analysis across multiple performance metrics. Our findings highlight the promise of hybrid retrieval in strengthening the adaptability and reliability of LLMs for cybersecurity tasks.
Problem

Research questions and friction points this paper is trying to address.

Adapting LLMs to evolving cybersecurity threats
Enhancing LLM accuracy in threat detection
Addressing opaque reasoning in security applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

RAG framework contextualizes cybersecurity data for LLMs
Hybrid retrieval enhances LLM adaptability to emerging threats
Optimized approach improves accuracy in temporal reasoning
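The summary also highlights interpretable, traceable responses. One common way RAG frameworks achieve traceability is by numbering the retrieved passages in the prompt and instructing the model to cite them. The template below is an illustrative sketch, not the paper's actual prompt; the query and threat-intel snippets are invented examples.

```python
def build_rag_prompt(query, retrieved):
    """Assemble a grounded prompt: retrieved threat intelligence is numbered
    so the model's answer can cite its sources, making output traceable."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(retrieved, start=1))
    return (
        "You are a cybersecurity analyst. Answer using ONLY the context below "
        "and cite entries by their [number].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

prompt = build_rag_prompt(
    "Is this host affected by the xz backdoor?",
    [
        "CVE-2024-3094: malicious code inserted into xz/liblzma 5.6.0 and 5.6.1",
        "ATT&CK T1195: supply chain compromise via software dependencies",
    ],
)
```

Because the context is injected at inference time, updating the threat-intel index immediately updates what the model can answer about, without retraining the underlying Llama-3-8B-Instruct model.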
👥 Authors
Arnabh Borah (School of Electrical and Computer Engineering, Georgia Institute of Technology)
Md Tanvirul Alam (Department of Computer Science, Rochester Institute of Technology)
Nidhi Rastogi (Assistant Professor, Rochester Institute of Technology, NY)
Research areas: Cybersecurity, Artificial Intelligence, Autonomous Vehicles, Graph Analytics, Applied Machine Learning