🤖 AI Summary
To address the limited interpretability and poor timeliness of conventional threat intelligence methods, this paper proposes an end-to-end framework integrating domain-adaptive fine-tuning of large language models (LLMs) with retrieval-augmented generation (RAG). The framework employs a vector database for dynamic, real-time retrieval of domain-specific cybersecurity knowledge and fine-tunes the LLM on security-oriented corpora to ensure transparent reasoning and up-to-date knowledge. Its key innovation lies in embedding the RAG mechanism deep within the threat analysis pipeline, thereby significantly improving detection accuracy for zero-day threats and enhancing attribution traceability. Experimental results demonstrate that the method outperforms baseline models across multiple threat intelligence tasks. Generated intelligence reports exhibit high readability, strong interpretability, and operational responsiveness, with average inference latency under 3 seconds.
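The retrieve-then-generate step at the core of such a pipeline can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy corpus, the bag-of-words similarity, and all function names are assumptions standing in for a real vector database and learned embedding model.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for a cybersecurity knowledge base;
# a real system would store learned embeddings in a vector database.
DOCS = [
    "CVE advisory: remote code execution in web server via crafted header",
    "Phishing campaign using lookalike domains to harvest credentials",
    "Zero-day exploit targeting kernel privilege escalation",
]

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding', used here only to illustrate retrieval."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the analyst's query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Augment the query with retrieved context before the LLM call."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    # The prompt below would be sent to the fine-tuned LLM for generation.
    print(build_prompt("zero-day privilege escalation exploit"))
```

In a production system, the retrieved context would be refreshed continuously from live threat feeds, which is what gives the RAG component its timeliness advantage over a statically fine-tuned model alone.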
📝 Abstract
As cyber threats continue to grow in complexity, traditional defense mechanisms often struggle to keep pace. Large Language Models (LLMs) offer significant potential in cybersecurity due to their advanced capabilities in text comprehension and generation. This research explores the use of LLMs with Retrieval-Augmented Generation (RAG) to enhance cybersecurity functions, combining real-time information retrieval with domain-specific data to produce more accurate and timely Cyber Threat Intelligence (CTI).