🤖 AI Summary
This work systematically investigates the threat posed by adversarial attacks driven by large language models (LLMs) against text-based Cyber Threat Intelligence (CTI) systems. Addressing the full CTI pipeline (intelligence collection, Indicator of Compromise (IoC) extraction, and classification decision-making), we design and implement three generative adversarial attack types: evasion, flooding, and poisoning. Leveraging LLMs, we generate semantically plausible and contextually coherent malicious threat texts. We present the first empirical evidence of differential vulnerability across CTI components under generative adversarial inputs, revealing that evasion is a critical prerequisite for successful flooding and poisoning. Experiments demonstrate that our attacks significantly degrade IoC extraction accuracy, induce high false-positive rates, and disrupt core system functionality, thereby validating the practical security risks that LLM-generated text poses to CTI systems in real-world deployment.
📝 Abstract
Cyber Threat Intelligence (CTI) has emerged as a vital complementary defense approach that operates in the early phases of the cyber threat lifecycle. CTI involves collecting, processing, and analyzing threat data to provide a more accurate and rapid understanding of cyber threats. Due to the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction. These automated systems leverage Open Source Intelligence (OSINT) from sources such as social networks, forums, and blogs to identify Indicators of Compromise (IoCs). Although prior research has focused on adversarial attacks against specific ML models, this study expands the scope by investigating the susceptibility of the various components of the entire CTI pipeline to adversarial attacks. These vulnerabilities arise because such pipelines ingest textual inputs from open sources that mix genuine and potentially fake content. We analyse three types of attacks against CTI pipelines: evasion, flooding, and poisoning, and assess their impact on the system's information selection capabilities. Regarding fake text generation, we demonstrate how adversarial text generation techniques can create fake cybersecurity and cybersecurity-like text that misleads classifiers, degrades performance, and disrupts system functionality. The focus is primarily on the evasion attack, as it precedes and enables flooding and poisoning attacks within the CTI pipeline.
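To make the evasion idea concrete, here is a minimal sketch (not the paper's actual pipeline or models): a naive regex-based IoC extractor, of the kind often used as a first stage in CTI pipelines, and a trivial text perturbation that hides an indicator from it while keeping the report readable to a human. The pattern and helper names are illustrative assumptions, not from the source.

```python
import re

# Illustrative pattern for IPv4-style indicators; real CTI extractors
# use richer patterns and NLP models, this is only a sketch.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_iocs(text: str) -> list[str]:
    """Return IPv4-style indicators found in the text."""
    return IPV4_RE.findall(text)

def evade(text: str) -> str:
    """Obfuscate the dots in each indicator so the extractor's pattern
    no longer matches, while a reader can still recover the address."""
    return IPV4_RE.sub(lambda m: m.group(0).replace(".", " [.] "), text)

report = "C2 traffic was observed to 203.0.113.7 over port 443."
print(extract_iocs(report))         # the indicator is found
print(extract_iocs(evade(report)))  # after perturbation, nothing is extracted
```

The same asymmetry, that small surface-level changes preserve meaning for humans but break automated extraction, is what LLM-generated adversarial text exploits at scale against learned classifiers rather than fixed patterns.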