KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment

2024-08-15
arXiv.org
Citations: 1
Influential: 0
AI Summary
To address the challenge of labor-intensive, low-automation credibility assessment of Cyber Threat Intelligence (CTI), this paper proposes the Knowledge Graph Verifier (KGV) framework. KGV uses large language models (LLMs) to automatically extract key claims from CTI documents and constructs a paragraph-level knowledge graph (paragraphs serve as nodes and semantic similarity as edges) to enable end-to-end fact-checking and joint reasoning. Its main contributions are: (1) introducing the first paragraph-level knowledge graph modeling paradigm for CTI; (2) releasing the first publicly available, cross-source CTI credibility evaluation dataset; and (3) achieving high-quality verification without extensive manual annotation. Experiments demonstrate that KGV significantly enhances LLM performance on CTI quality assessment, achieving an accuracy of XXX while substantially reducing annotation costs.

๐Ÿ“ Abstract
Cyber threat intelligence is a critical tool that many organizations and individuals use to protect themselves from sophisticated, organized, persistent, and weaponized cyber attacks. However, few studies have focused on the quality assessment of threat intelligence provided by intelligence platforms, and this work still requires manual analysis by cybersecurity experts. In this paper, we propose a knowledge graph-based verifier, a novel Cyber Threat Intelligence (CTI) quality assessment framework that combines knowledge graphs and Large Language Models (LLMs). Our approach uses LLMs to automatically extract the key claims to be verified from open-source CTI (OSCTI) and fact-checks them against a knowledge graph built from paragraphs. This differs from the traditional approach of constructing complex knowledge graphs with entities as nodes. Constructing the knowledge graph with paragraphs as nodes and semantic similarity as edges enhances the model's semantic understanding and simplifies labeling requirements. Additionally, to fill a gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources. To the best of our knowledge, this work is the first to create a dataset on threat intelligence reliability verification, providing a reference for future research. Experimental results show that KGV (Knowledge Graph Verifier) significantly improves the performance of LLMs in intelligence quality assessment. Compared with traditional methods, our approach requires far less data annotation while the model still exhibits strong reasoning capabilities. Finally, our method can achieve XXX accuracy in network threat assessment.
Problem

Research questions and friction points this paper is trying to address.

Automating cyber threat intelligence credibility assessment
Integrating LLMs with knowledge graphs for CTI analysis
Reducing manual expert effort in CTI credibility evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LLMs with semantic knowledge graphs
Uses paragraph-level nodes for enhanced understanding
Reduces node quantity and boosts precision
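The paragraph-level graph described in the abstract (paragraphs as nodes, semantic similarity as edges) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for whatever semantic encoder the paper uses, and the function names, the `threshold` value, and the sample paragraphs are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def embed(text):
    """Toy stand-in for a semantic encoder (assumption): bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_paragraph_graph(paragraphs, threshold=0.3):
    """Nodes are paragraph indices; an edge joins two paragraphs whose
    similarity reaches the (hypothetical) threshold."""
    vectors = [embed(p) for p in paragraphs]
    graph = {i: {} for i in range(len(paragraphs))}
    for i, j in combinations(range(len(paragraphs)), 2):
        sim = cosine(vectors[i], vectors[j])
        if sim >= threshold:
            graph[i][j] = sim
            graph[j][i] = sim
    return graph


def retrieve_evidence(claim, paragraphs, graph, top_k=1):
    """Pick the top_k paragraphs most similar to the claim, then add their
    graph neighbors as supporting context for fact-checking."""
    claim_vec = embed(claim)
    ranked = sorted(
        range(len(paragraphs)),
        key=lambda i: cosine(claim_vec, embed(paragraphs[i])),
        reverse=True,
    )
    evidence = set(ranked[:top_k])
    for seed in ranked[:top_k]:
        evidence.update(graph[seed])
    return sorted(evidence)


# Made-up paragraphs and claim, for illustration only.
paragraphs = [
    "The Foo malware spreads through phishing emails with malicious attachments.",
    "Phishing emails carrying malicious attachments deliver the Foo malware loader.",
    "The quarterly budget meeting was moved to Thursday.",
]
graph = build_paragraph_graph(paragraphs)
claim = "Foo malware is delivered by phishing emails"
print(retrieve_evidence(claim, paragraphs, graph))
```

In this sketch the two malware paragraphs are linked by an edge while the unrelated paragraph stays isolated, so a claim that matches one of them also pulls in the other as evidence. In a setup like the one the abstract describes, the collected paragraphs would then be passed to an LLM together with the claim; that step is omitted here.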
Zongzong Wu
School of Computer Science and Engineering, Central South University, Changsha, China
Fengxiao Tang
Tohoku University, Central South University
deep learning, wireless network
Ming Zhao
School of Computer Science and Engineering, Central South University, Changsha, China
Yufeng Li
East China Normal University
Artificial Intelligence