APT-CGLP: Advanced Persistent Threat Hunting via Contrastive Graph-Language Pre-Training

📅 2025-11-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the cross-modal heterogeneity between provenance graphs and Cyber Threat Intelligence (CTI) reports, the information loss caused by graph extraction, and the heavy reliance on manual annotation, this paper proposes an end-to-end cross-modal semantic matching framework. The method combines large language model-driven synthesis of high-quality graph-text pairs, multi-objective contrastive graph-language pre-training, and cross-modal masked modeling to align attack semantics at both coarse- and fine-grained levels. Because it matches provenance graphs against CTI reports directly, it does not need to extract explicit attack graphs from the reports, which avoids the associated information loss and reduces human intervention. Experiments on four real-world APT datasets show that the approach outperforms state-of-the-art threat hunting baselines in both detection accuracy and inference efficiency.

📝 Abstract

Provenance-based threat hunting identifies Advanced Persistent Threats (APTs) on endpoints by correlating attack patterns described in Cyber Threat Intelligence (CTI) with provenance graphs derived from system audit logs. A fundamental challenge in this paradigm lies in the modality gap: the structural and semantic disconnect between provenance graphs and CTI reports. Prior work addresses this by framing threat hunting as a graph matching task: 1) extracting attack graphs from CTI reports, and 2) aligning them with provenance graphs. However, this pipeline incurs severe information loss during graph extraction and demands intensive manual curation, undermining scalability and effectiveness. In this paper, we present APT-CGLP, a novel cross-modal APT hunting system via Contrastive Graph-Language Pre-training, facilitating end-to-end semantic matching between provenance graphs and CTI reports without human intervention. First, empowered by the Large Language Model (LLM), APT-CGLP mitigates data scarcity by synthesizing high-fidelity provenance graph-CTI report pairs, while simultaneously distilling actionable insights from noisy web-sourced CTIs to improve their operational utility. Second, APT-CGLP incorporates a tailored multi-objective training algorithm that synergizes contrastive learning with inter-modal masked modeling, promoting cross-modal attack semantic alignment at both coarse- and fine-grained levels. Extensive experiments on four real-world APT datasets demonstrate that APT-CGLP consistently outperforms state-of-the-art threat hunting baselines in terms of accuracy and efficiency.
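
To make the coarse-grained alignment idea concrete, the following is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss over paired graph and report embeddings. It is not taken from the paper: the function names, the cosine-similarity scoring, and the temperature value of 0.07 are all assumptions chosen for illustration, and the encoders that would produce the embeddings are left out entirely.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similarity_logits(graph_embs, text_embs, temperature):
    """Return the matrix of temperature-scaled cosine similarities (graphs x texts)."""
    graphs = [l2_normalize(v) for v in graph_embs]
    texts = [l2_normalize(v) for v in text_embs]
    return [
        [sum(a * b for a, b in zip(g, t)) / temperature for t in texts]
        for g in graphs
    ]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(graph_embs, text_embs, temperature=0.07):
    """Average of graph-to-text and text-to-graph losses for a batch of pairs.

    Pair i consists of graph_embs[i] and text_embs[i]; every other item in the
    batch is treated as a negative. The temperature is a hypothetical default.
    """
    if not graph_embs or len(graph_embs) != len(text_embs):
        raise ValueError("need a non-empty batch of matched pairs")
    logits = similarity_logits(graph_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))
```

With toy embeddings, a batch whose graph and report vectors line up pairwise yields a loss near zero, while swapping the reports yields a much larger loss. That gap is the signal a contrastive objective of this kind would use to pull matching pairs together.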

Problem

Research questions and friction points this paper is trying to address.

Bridging the modality gap between provenance graphs and CTI reports for threat hunting

Eliminating information loss and manual curation in traditional graph matching approaches

Enabling end-to-end semantic matching between attack patterns and system logs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses an LLM to synthesize provenance graph-CTI report training pairs

Combines contrastive learning with masked modeling

Enables end-to-end semantic matching without human intervention
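
As an illustration of the masked-modeling ingredient listed above, the sketch below shows only the input-corruption step: hiding a fraction of report tokens and recording what a model would be asked to recover. The summary does not describe how the paper masks inputs or which modality is masked, so the `[MASK]` placeholder, the 15% default ratio, whitespace tokenization, and the function name are all assumptions for illustration.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder token


def mask_tokens(tokens, mask_ratio=0.15, seed=None):
    """Replace a random subset of tokens with MASK_TOKEN.

    Returns (masked_tokens, targets), where targets maps each masked position
    to the original token a model would be trained to predict.
    """
    if not 0.0 < mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in (0, 1]")
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    # Mask at least one token, and never more than the sequence holds.
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_ratio)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK_TOKEN
    return masked, targets


report = "powershell.exe downloads payload.dll and writes it to disk".split()
masked_report, targets = mask_tokens(report, mask_ratio=0.25, seed=0)
```

In a cross-modal setting, the masked report would be fed to the model together with the paired provenance graph, so that recovering the hidden tokens requires fine-grained information from the other modality.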
