Learning the APT Kill Chain: Temporal Reasoning over Provenance Data for Attack Stage Estimation

2026-03-08
AI Summary
This work addresses the problem of identifying the current stage of an Advanced Persistent Threat (APT) attack. APTs progress through multiple stages, and knowing the current one is critical for adaptive defense. The paper proposes StageFinder, a framework that fuses host- and network-level provenance data into a temporal graph. StageFinder uses a Graph Neural Network (GNN) to model structural dependencies among entities and a Long Short-Term Memory (LSTM) network to capture the temporal dynamics of attack behavior, producing stage estimates aligned with the MITRE ATT&CK framework. Pretrained on the DARPA OpTC dataset and fine-tuned on labeled DARPA Transparent Computing data, StageFinder achieves a macro F1-score of 0.96 and reduces prediction volatility by 31% compared to the baselines Cyberian and NetGuardian.
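
The data flow described above (graph encoding per time window, then a recurrent model over the window sequence) can be illustrated with a small sketch. This is not the authors' implementation: the mean-neighbor aggregation, the mean-pool readout, the hidden size, the random untrained weights, the undirected treatment of edges, and the four-stage label subset are all assumptions chosen to keep the example self-contained in standard-library Python.

```python
import math
import random

# Illustrative subset of MITRE ATT&CK tactics (assumption, not the paper's label set).
STAGES = ["Reconnaissance", "Initial Access", "Lateral Movement", "Exfiltration"]


def mean_aggregate(features, edges):
    """One round of mean-neighbor message passing (a weight-free GNN-style layer).

    Edges are treated as undirected here for simplicity; real provenance
    edges are directed.
    """
    neighbors = {i: [i] for i in range(len(features))}  # include a self-loop
    for src, dst in edges:
        neighbors[dst].append(src)
        neighbors[src].append(dst)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nb) / len(nb) for d in range(dim)]
        for nb in (neighbors[i] for i in range(len(features)))
    ]


def readout(features):
    """Mean-pool node embeddings into a single graph-level vector."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Minimal LSTM cell with random, untrained weights (for shape illustration only)."""

    def __init__(self, input_dim, hidden_dim, rng):
        size = input_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.w = {g: [[rng.uniform(-0.5, 0.5) for _ in range(size)]
                      for _ in range(hidden_dim)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # concatenate input and previous hidden state

        def gate(g, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.w[g], self.b[g])]

        i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        cand = gate("c", math.tanh)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def stage_probabilities(windows, hidden_dim=8, seed=0):
    """Map a sequence of (node_features, edges) windows to per-window stage probabilities."""
    rng = random.Random(seed)
    input_dim = len(windows[0][0][0])
    cell = LSTMCell(input_dim, hidden_dim, rng)
    head = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in STAGES]
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    out = []
    for features, edges in windows:
        graph_vec = readout(mean_aggregate(features, edges))
        h, c = cell.step(graph_vec, h, c)
        out.append(softmax([sum(w * v for w, v in zip(row, h)) for row in head]))
    return out


# Hypothetical toy input: one-hot node types (process, file, connection).
windows = [
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [(0, 1), (0, 2)]),
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [(0, 1)]),
]
probs = stage_probabilities(windows)
```

Each entry of `probs` is a probability distribution over `STAGES` for one time window. Because the weights are random, the values carry no meaning; a trained model would replace the aggregation and the cell with learned layers.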

๐Ÿ“ Abstract
Advanced Persistent Threats (APTs) evolve through multiple stages, each exhibiting distinct temporal and structural behaviors. Accurate stage estimation is critical for enabling adaptive cyber defense. This paper presents StageFinder, a temporal graph learning framework for multi-stage attack progression inference from fused host and network provenance data. Provenance graphs are encoded using a graph neural network to capture structural dependencies among processes, files, and connections, while a long short-term memory (LSTM) model learns temporal dynamics to estimate stage probabilities aligned with the MITRE ATT&CK framework. The model is pretrained on the DARPA OpTC dataset and fine-tuned on labeled DARPA Transparent Computing data. Experimental results demonstrate that StageFinder achieves a macro F1-score of 0.96 and reduces prediction volatility by 31 percent compared to state-of-the-art baselines (Cyberian, NetGuardian). These results highlight the effectiveness of fused provenance and temporal learning for accurate and stable APT stage inference.
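
The two reported quantities can be made concrete with a short sketch. Macro F1 is the unweighted mean of per-class F1 scores. The abstract does not define "prediction volatility", so the `volatility` function below is only one plausible reading (the fraction of consecutive time windows whose predicted stage changes) and should be treated as an assumption. The label sequences are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in truth or prediction."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def volatility(y_pred):
    """Assumed definition: share of adjacent window pairs whose predicted stage differs."""
    if len(y_pred) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(y_pred, y_pred[1:]))
    return flips / (len(y_pred) - 1)


# Hypothetical per-window stage labels for one attack trace.
y_true = ["Initial Access", "Initial Access", "Lateral Movement", "Lateral Movement"]
y_pred = ["Initial Access", "Lateral Movement", "Lateral Movement", "Lateral Movement"]

score = macro_f1(y_true, y_pred)
flip_rate = volatility(y_pred)
```

Under this reading, a lower `flip_rate` at equal `score` means the model's stage estimate jumps around less between adjacent windows.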
Problem

Research questions and friction points this paper is trying to address.

Advanced Persistent Threats
Attack Stage Estimation
Provenance Data
Temporal Reasoning
Cyber Defense
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal Graph Learning
Provenance Data Fusion
APT Stage Estimation
Graph Neural Network
LSTM
Trung V. Phan
Assistant Professor, Claremont Colleges (Pitzer & Scripps)
biophysics, robophysics, condensed matter, cancer chemotherapy, machine learning
Thomas Bauschert
Chair of Communication Networks, Technische Universität Chemnitz, 09126 Chemnitz, Germany