GRID: Graph Representation of Intelligence Data for Security Text Knowledge Graph Construction

📅 2026-05-15
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the automatic construction of security knowledge graphs from long-form cyber threat intelligence (CTI) texts, where large language models (LLMs) often lack grounded security-domain knowledge and end-to-end document-to-graph training is hard to supervise with cheap, stable rewards. The authors propose GRID, a framework that builds traceable article-graph alignment data and uses it to reformulate graph extraction as a scripted task bank combining four-option multi-select questions with triple-level regex matching targets. The task-bank reward is built once offline and reused across post-training runs, replacing online LLM-as-judge scoring of full graph outputs and giving more stable rewards. On 249 CTI articles drawn from GRID, CASIE, CTINexus, MalKG, and SecureNLP, the Task-bank Reward model, trained from Qwen3-4B-Instruct-2507, reaches 84.62% source-averaged precision, 64.91% source-averaged recall, and 68.53% Avg F1. This is the best source-averaged recall and a near-top Avg F1, with lower token usage and deployment cost.
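A note on reading the headline numbers: the harmonic mean of 84.62% precision and 64.91% recall is about 73.47%, not 68.53%, so the reported F1 cannot be the F1 of the averaged precision and recall. It is presumably the mean of per-source F1 scores, which is an assumption here, not something the summary states. The sketch below shows the difference; the per-source values are invented for illustration and are not results from the paper.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def source_averaged(per_source: dict) -> tuple:
    """Average precision, recall, and F1 across sources (macro-averaging).

    `per_source` maps a source name to a (precision, recall) pair.
    F1 is computed per source first and then averaged, which is the
    assumed meaning of "Avg F1".
    """
    n = len(per_source)
    avg_p = sum(p for p, _ in per_source.values()) / n
    avg_r = sum(r for _, r in per_source.values()) / n
    avg_f1 = sum(f1(p, r) for p, r in per_source.values()) / n
    return avg_p, avg_r, avg_f1


# F1 computed directly from the reported averages (percent scale).
f1_of_reported_averages = f1(84.62, 64.91)

# Hypothetical per-source scores, NOT taken from the paper.
toy_scores = {"source_a": (1.0, 0.2), "source_b": (0.6, 0.8)}
toy_p, toy_r, toy_avg_f1 = source_averaged(toy_scores)
toy_f1_of_averages = f1(toy_p, toy_r)

print(f"F1 of reported averages: {f1_of_reported_averages:.2f}")
print(f"toy Avg F1: {toy_avg_f1:.4f} vs F1 of toy averages: {toy_f1_of_averages:.4f}")
```

When precision and recall are unbalanced within individual sources, the mean of per-source F1 falls below the F1 of the averaged precision and recall, which is consistent with the gap seen in the reported figures.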
📝 Abstract
Security knowledge graphs can provide computable external memory for security agents, but constructing them from long-form cyber threat intelligence (CTI) remains difficult: LLMs often lack grounded security-domain knowledge, and end-to-end document-to-graph training is hard to supervise with cheap, stable rewards. We present GRID (Graph Representation of Intelligence Data), an end-to-end framework for security text knowledge graph construction. GRID first builds security-domain supervision from CTI articles by creating traceable article-graph alignments through graph extraction and knowledge-graph-conditioned text revision. It then turns document-to-graph learning into a scripted task bank combining four-option multi-select questions with triple-level regex matching targets, yielding more stable task-specific rewards than repeatedly scoring full graph outputs with an LLM judge. Using this supervision pipeline, we train two Qwen3-4B-Instruct-2507-based 4B extractors: a primary Task-bank Reward model and a secondary End2End Reward model with LLM-as-judge precision/recall rewards. On 249 CTI articles from GRID, CASIE, CTINexus, MalKG, and SecureNLP, the Task-bank Reward model with the ontology-guided GRID extraction pipeline reaches 84.62% source-averaged precision, 64.91% source-averaged recall, and 68.53% Avg F1, achieving the best source-averaged recall and near-top Avg F1 with lower token usage and deployment cost. The End2End Reward model reaches 76.91% precision, 53.85% recall, and 58.06% Avg F1. Further analyses show that task-bank rewards can be built once offline and reused across later post-training runs, outperforming online End2End LLM-as-judge reward and weaker alternatives such as Choice-only Reward and End2End SFT without RL.
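To make the task-bank idea concrete, here is a minimal sketch of how a scripted reward could score a model's output without calling an LLM judge. It is not the authors' implementation: the class names, the one-regex-per-field triple format, the exact-match scoring of multi-select answers, and the example entities are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ChoiceTask:
    """Four-option multi-select question; `answer` holds the correct letters."""
    question: str
    answer: frozenset


@dataclass(frozen=True)
class TripleTask:
    """One target triple, stored as a regex per field (hypothetical format)."""
    head: str
    relation: str
    tail: str


def choice_reward(task: ChoiceTask, predicted: set) -> float:
    """Return 1.0 only if the selected option set equals the answer key."""
    return 1.0 if frozenset(predicted) == task.answer else 0.0


def triple_reward(tasks: list, predicted_triples: list) -> float:
    """Fraction of target triples matched by at least one predicted triple."""
    if not tasks:
        return 0.0
    hits = 0
    for task in tasks:
        patterns = (task.head, task.relation, task.tail)
        matched = any(
            all(re.search(p, field, re.IGNORECASE) for p, field in zip(patterns, triple))
            for triple in predicted_triples
        )
        hits += int(matched)
    return hits / len(tasks)


# Made-up targets and model output, for illustration only.
targets = [
    TripleTask(r"\bExampleGroup\b", r"\buses?\b", r"\bExample-?RAT\b"),
    TripleTask(r"\bExample-?RAT\b", r"\btargets?\b", r"\bWindows\b"),
]
predicted = [("ExampleGroup", "uses", "ExampleRAT")]
question = ChoiceTask("Which tools does the report attribute to ExampleGroup?", frozenset("AC"))

triple_score = triple_reward(targets, predicted)
exact_choice_score = choice_reward(question, {"A", "C"})
partial_choice_score = choice_reward(question, {"A"})
print(triple_score, exact_choice_score, partial_choice_score)
```

Because both scorers are deterministic functions of stored answer keys and patterns, a bank like this can be built once and reused across training runs, which is the property the abstract contrasts with repeatedly scoring full graph outputs with an LLM judge.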
Problem

Research questions and friction points this paper is trying to address.

security knowledge graph
cyber threat intelligence
document-to-graph construction
supervision signal
LLM grounding
Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge graph construction
cyber threat intelligence
task-bank reward
LLM-based extraction
graph-text alignment