🤖 AI Summary
Large language models (LLMs) adapt poorly to emerging cyber threats, reason opaquely, and lack verifiable trustworthiness in dynamic security environments. To address these challenges, this paper proposes a retrieval-augmented generation (RAG) framework tailored to dynamic cybersecurity scenarios. The core contribution is an optimized hybrid retrieval mechanism that integrates real-time, multi-source threat intelligence (CVE, MITRE ATT&CK, and operational threat feeds) to enhance LLMs’ temporal reasoning and contextual understanding of novel attack patterns. The framework uses Llama-3-8B-Instruct as the foundation model, so that its knowledge can be updated through retrieval and its responses are interpretable and traceable. Experimental evaluation on threat detection tasks shows improvements over baselines: a 12.7% accuracy gain and more consistent outputs. The framework improves model adaptability to rapidly evolving threat landscapes while strengthening reliability and explainability in security-critical applications.
📝 Abstract
Security applications increasingly rely on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge. Because security threats evolve rapidly, LLMs must not only recall historical incidents but also adapt to emerging vulnerabilities and attack patterns. Retrieval-Augmented Generation (RAG) has demonstrated effectiveness in general LLM applications, but its potential for cybersecurity remains underexplored. In this work, we introduce a RAG-based framework designed to contextualize cybersecurity data and enhance LLM accuracy in knowledge retention and temporal reasoning. Using external datasets and the Llama-3-8B-Instruct model, we evaluate baseline RAG and an optimized hybrid retrieval approach, and conduct a comparative analysis across multiple performance metrics. Our findings highlight the promise of hybrid retrieval in strengthening the adaptability and reliability of LLMs for cybersecurity tasks.
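The abstract does not spell out how the hybrid retrieval step is built, so the following is only an illustrative sketch of one common way such a step could look: a lexical ranking and a similarity-based ranking are merged with reciprocal rank fusion, and the top passages are placed into the prompt with source ids so the answer can be traced. Everything here is an assumption made for illustration: the tiny `DOCS` corpus (placeholder text, not real advisories), the character-trigram vectors standing in for a learned embedding model, the fusion constant `k=60`, and the helper names `hybrid_retrieve` and `build_prompt`. None of it is taken from the paper.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; entries are illustrative placeholders, not real advisories.
DOCS = {
    "doc-a": "remote code execution in web server via crafted http request",
    "doc-b": "phishing email delivers credential stealing malware",
    "doc-c": "privilege escalation through kernel driver vulnerability",
}

def tokens(text):
    return text.lower().split()

def lexical_score(query, doc):
    """Sparse signal: number of query tokens that occur in the document."""
    doc_tokens = set(tokens(doc))
    return sum(1 for t in tokens(query) if t in doc_tokens)

def trigram_vector(text):
    """Assumed stand-in for a dense embedding: character-trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank(scores):
    """Return document ids ordered from highest to lowest score."""
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

def hybrid_retrieve(query, docs, k=60, top_n=2):
    """Fuse the lexical and similarity rankings with reciprocal rank fusion."""
    lexical = rank({d: lexical_score(query, text) for d, text in docs.items()})
    query_vec = trigram_vector(query)
    dense = rank({d: cosine(query_vec, trigram_vector(text)) for d, text in docs.items()})
    fused = Counter()
    for ranking in (lexical, dense):
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return [doc_id for doc_id, _ in fused.most_common(top_n)]

def build_prompt(query, docs, retrieved_ids):
    """Attach retrieved passages with their ids so the answer can cite them."""
    context = "\n".join(f"[{d}] {docs[d]}" for d in retrieved_ids)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer and cite the bracketed ids."

query = "kernel privilege escalation"
retrieved = hybrid_retrieve(query, DOCS)
prompt = build_prompt(query, DOCS, retrieved)
```

In a setup like the one the abstract describes, the resulting `prompt` would be passed to the generator model (Llama-3-8B-Instruct in the paper), and the corpus would be refreshed from threat intelligence sources over time. Rank fusion is used in this sketch because it avoids having to calibrate lexical and similarity scores against each other; a weighted sum of normalized scores would be an equally plausible choice.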