🤖 AI Summary
Existing cybersecurity knowledge graphs rely heavily on structured data, so they struggle to incorporate rapidly evolving unstructured threat intelligence in time, which delays risk insights. To address this, the authors propose TRACE, a framework that fuses and aligns 24 structured databases with three categories of unstructured data (APT reports, academic papers, and repair notices), using large language models for entity extraction, entity alignment, and continuous knowledge graph updates. The authors report a 1.8× increase in node coverage over existing knowledge graphs and an entity extraction F1 score of 81.24%, which exceeds the best-known LLM-based baselines by 7.8%. The resulting graph is intended to support timely, comprehensive threat analysis.
📝 Abstract
The rapid evolution of cyber threats has highlighted significant gaps in security knowledge integration. Cybersecurity Knowledge Graphs (CKGs) that rely on structured data inherently lag behind the threat landscape: rapidly evolving unstructured data is rarely incorporated in time, so insights critical for risk analysis may be omitted. To address these limitations, we introduce TRACE, a framework designed to integrate structured and unstructured cybersecurity data sources. TRACE integrates knowledge from 24 structured databases and 3 categories of unstructured data: APT reports, papers, and repair notices. Leveraging Large Language Models (LLMs), TRACE performs efficient entity extraction and alignment, enabling continuous updates to the CKG. Evaluations demonstrate that TRACE achieves a 1.8x increase in node coverage compared to existing CKGs. In entity extraction, TRACE attains a precision of 86.08%, a recall of 76.92%, and an F1 score of 81.24%, surpassing the best-known LLM-based baselines by 7.8%. Furthermore, our entity alignment methods effectively harmonize extracted entities with existing knowledge structures, enhancing the integrity and utility of the CKG. With TRACE, threat hunters and attack analysts gain real-time, holistic insights into vulnerabilities, attack methods, and defense technologies.
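
As a quick consistency check on the reported entity extraction metrics, the F1 score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. The sketch below is illustrative only: the function and variable names are assumptions made here, not part of TRACE, and the only inputs are the two percentages quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, in percent (names are hypothetical).
reported_precision = 86.08
reported_recall = 76.92

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 81.24%"
```

The recomputed value agrees with the reported F1 score of 81.24%, so the three figures are mutually consistent.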