KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment

📅 2024-08-15

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Credibility assessment of Cyber Threat Intelligence (CTI) is still labor-intensive and poorly automated. To address this, the paper proposes KGV (Knowledge Graph Verifier), a framework that uses large language models (LLMs) to extract the key claims from CTI documents and fact-checks them against a paragraph-level knowledge graph, in which paragraphs are the nodes and semantic similarity defines the edges. The main contributions are: (1) a paragraph-level knowledge graph for CTI verification, in place of the usual entity-level graphs; (2) what the authors describe as the first public dataset for threat intelligence assessment drawn from heterogeneous sources; and (3) verification that needs far less manual annotation. Experiments show that KGV significantly improves LLM performance on CTI quality assessment, reaching an accuracy of XXX.

📝 Abstract

Cyber threat intelligence is a critical tool that many organizations and individuals use to protect themselves from sophisticated, organized, persistent, and weaponized cyber attacks. However, few studies have focused on the quality assessment of threat intelligence provided by intelligence platforms, and this work still requires manual analysis by cybersecurity experts. In this paper, we propose a knowledge graph-based verifier, a novel Cyber Threat Intelligence (CTI) quality assessment framework that combines knowledge graphs and Large Language Models (LLMs). Our approach uses LLMs to automatically extract the key claims to be verified from open-source CTI (OSCTI) and uses a knowledge graph built from paragraphs for fact-checking. This differs from the traditional practice of constructing complex knowledge graphs with entities as nodes. Constructing the graph with paragraphs as nodes and semantic similarity as edges enhances the semantic understanding ability of the model and simplifies labeling requirements. Additionally, to fill a gap in the research field, we created and made public a dataset for threat intelligence assessment from heterogeneous sources. To the best of our knowledge, this is the first dataset for threat intelligence reliability verification, and it provides a reference for future research. Experimental results show that KGV (Knowledge Graph Verifier) significantly improves the performance of LLMs in intelligence quality assessment. Compared with traditional methods, it requires far less data annotation while the model still exhibits strong reasoning capabilities. Finally, our method achieves XXX accuracy in network threat assessment.
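
The paragraph-as-node idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the bag-of-words cosine similarity is a toy stand-in for whatever semantic embedding the paper uses, and the 0.3 threshold, the function names (embed, cosine, build_paragraph_graph, evidence_for), and the sample paragraphs are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_paragraph_graph(paragraphs, threshold=0.3):
    """Nodes are paragraph indices; an undirected weighted edge links two
    paragraphs whose similarity reaches the (assumed) threshold."""
    vectors = [embed(p) for p in paragraphs]
    graph = {i: {} for i in range(len(paragraphs))}
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            sim = cosine(vectors[i], vectors[j])
            if sim >= threshold:
                graph[i][j] = sim
                graph[j][i] = sim
    return graph


def evidence_for(claim, paragraphs, graph):
    """Return the best-matching paragraph index followed by its graph
    neighbors, ordered from most to least similar."""
    claim_vec = embed(claim)
    best = max(range(len(paragraphs)),
               key=lambda i: cosine(claim_vec, embed(paragraphs[i])))
    neighbors = sorted(graph[best], key=graph[best].get, reverse=True)
    return [best] + neighbors


# Illustrative paragraphs only; not taken from the paper's dataset.
paragraphs = [
    "The Emotet botnet spreads through phishing emails with malicious attachments.",
    "Phishing emails with malicious attachments deliver the Emotet loader.",
    "The vendor released a patch for the VPN appliance last week.",
]
graph = build_paragraph_graph(paragraphs)
print(evidence_for("Emotet is delivered via phishing emails", paragraphs, graph))
```

In a pipeline of the kind the abstract describes, the claim passed to evidence_for would come from the LLM extraction step, and the returned paragraphs would be handed back to the LLM as evidence for the credibility judgment. No entity or relation labeling is needed to build the graph, which is the annotation saving the abstract points to.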

Problem

Research questions and friction points this paper is trying to address.

Automating cyber threat intelligence credibility assessment

Integrating LLMs with knowledge graphs for CTI analysis

Reducing manual expert effort in CTI credibility evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LLMs with semantic knowledge graphs

Uses paragraph-level nodes for enhanced understanding

Reduces node quantity and boosts precision
