SynAT: Enhancing Security Knowledge Bases via Automatic Synthesizing Attack Tree from Crowd Discussions

📅 2026-02-05
🤖 AI Summary
This work addresses the challenge of delayed updates in existing security knowledge bases, which hinder timely responses to emerging cyber threats. To bridge this gap, the authors propose SynAT, a novel end-to-end method that automatically constructs attack trees from unstructured, crowdsourced security discussions. SynAT first employs a large language model with prompt-based learning to identify sentences containing attack-related information, then leverages a transition-based joint event and relation extraction model to capture key elements, and finally applies heuristic rules to synthesize attack trees. Evaluated on 5,070 Stack Overflow posts, SynAT outperforms baseline approaches in both event/relation extraction and attack tree similarity. The method has been successfully applied to enhance Huawei’s internal knowledge base as well as public repositories such as CVE and CAPEC, effectively reducing the timeliness gap between community-driven insights and formal security knowledge bases.

📝 Abstract
Cyber attacks have become a serious threat to the security of software systems. Many organizations have built their own security knowledge bases to safeguard against attacks and vulnerabilities. However, because official security information is released with a time lag, these knowledge bases may not be kept up to date, and using them to protect software systems against emerging security risks can be challenging. On the other hand, security posts on online knowledge-sharing platforms contain many crowd security discussions, and the knowledge in those posts can be used to enhance the security knowledge bases. This paper proposes SynAT, an automatic approach to synthesize attack trees from crowd security posts. Given a security post, SynAT first uses a Large Language Model (LLM) and prompt learning to restrict the scope to sentences that may contain attack information; then it uses a transition-based event and relation extraction model to extract the events and relations simultaneously from that scope; finally, it applies heuristic rules to synthesize the attack trees from the extracted events and relations. An experimental evaluation is conducted on 5,070 Stack Overflow security posts, and the results show that SynAT outperforms all baselines in both event and relation extraction, and achieves the highest tree similarity in attack tree synthesis. Furthermore, SynAT has been applied to enhance HUAWEI's security knowledge base as well as the public security knowledge bases CVE and CAPEC, which demonstrates SynAT's practicality.
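To make the final synthesis step more concrete, here is a minimal sketch of how already-extracted events and relations could be assembled into an attack tree. It is an illustration only, not SynAT's implementation: the abstract does not spell out the heuristic rules, so the `Node` class, the `build_attack_tree` and `render` helpers, the `(parent, gate, child)` triple format, the `AND`/`OR` relation types, and the example events are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    gate: str = "LEAF"  # "AND", "OR", or "LEAF" for a node with no children
    children: list = field(default_factory=list)


def build_attack_tree(goal, relations):
    """Assemble a tree from hypothetical (parent, gate, child) triples.

    Assumed rule: all children of one parent share the same gate type;
    conflicting gate types for a parent are rejected. Cycles are not checked.
    """
    nodes = {goal: Node(goal)}
    for parent, gate, child in relations:
        if gate not in ("AND", "OR"):
            raise ValueError(f"unknown relation type: {gate}")
        p = nodes.setdefault(parent, Node(parent))
        c = nodes.setdefault(child, Node(child))
        if p.gate not in ("LEAF", gate):
            raise ValueError(f"conflicting gates for {parent!r}")
        p.gate = gate
        p.children.append(c)
    return nodes[goal]


def render(node, depth=0):
    """Return the tree as a list of indented text lines."""
    suffix = "" if node.gate == "LEAF" else f" [{node.gate}]"
    lines = ["  " * depth + node.label + suffix]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Made-up events and relations, standing in for the extraction model's output.
relations = [
    ("steal session", "OR", "XSS cookie theft"),
    ("steal session", "OR", "session fixation"),
    ("XSS cookie theft", "AND", "inject script"),
    ("XSS cookie theft", "AND", "read document.cookie"),
]
tree = build_attack_tree("steal session", relations)
print("\n".join(render(tree)))
```

In this sketch the root is the attack goal, an `OR` node lists alternative ways to reach its parent, and an `AND` node lists steps that must all succeed. The actual rules SynAT applies may differ.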
Problem

Research questions and friction points this paper is trying to address.

security knowledge base
attack tree
crowd discussions
cyber attacks
knowledge enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attack Tree Synthesis
Large Language Model
Event and Relation Extraction
Crowdsourced Security Knowledge
Security Knowledge Base Enhancement
Ziyou Jiang
Institute of Software Chinese Academy of Sciences
software engineering
Lin Shi
Beihang University
Software Engineering
Guowei Yang
The University of Queensland
Software engineering, Program analysis, Mobile software, AI4SE, SE4AI
Xuyan Ma
State Key Laboratory of Intelligent Game, China, Science and Technology on Integrated Information System Laboratory, Institute of Software Chinese Academy of Sciences, China, and University of Chinese Academy of Sciences, China
Fenglong Li
Huawei Cloud Computing Technologies CO., LTD., China
Qing Wang
Institute of Software Chinese Academy of Sciences
Software engineering