DeepStage: Learning Autonomous Defense Policies Against Multi-Stage APT Campaigns

📅 2026-03-17
🤖 AI Summary
This work proposes a deep reinforcement learning (DRL) defense framework to address the lack of adaptive, stage-aware autonomous defense against multi-stage advanced persistent threats (APTs). The framework fuses host provenance and network telemetry into a unified provenance graph, then uses a graph neural network encoder and an LSTM-based estimator to infer the attacker’s current stage within the MITRE ATT&CK framework. These stage estimates drive a hierarchical Proximal Policy Optimization (PPO) agent that selects defense actions across monitoring, access control, containment, and remediation. Presented as the first stage-aware autonomous defense mechanism aligned with ATT&CK, the approach achieves a stage-weighted F1-score of 0.89 in a realistic enterprise testbed driven by CALDERA APT playbooks, outperforming a risk-aware DRL baseline by 21.9%.
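The summary gives the DeepStage score (0.89) and the improvement (21.9%) but not the baseline score itself. As a rough back-of-the-envelope check, the sketch below derives the implied baseline under two possible readings of "21.9%". Which reading the paper intends is not stated here, so both are assumptions, and the variable names are illustrative only.

```python
# Figures quoted in the summary above.
f1_deepstage = 0.89
improvement = 0.219

# Reading 1 (assumption): relative improvement,
# i.e. f1_deepstage = baseline * (1 + improvement).
baseline_relative = f1_deepstage / (1 + improvement)

# Reading 2 (assumption): absolute gain of 21.9 percentage points,
# i.e. f1_deepstage = baseline + improvement.
baseline_absolute = f1_deepstage - improvement

print(f"implied baseline (relative reading): {baseline_relative:.3f}")
print(f"implied baseline (absolute reading): {baseline_absolute:.3f}")
```

Under the relative reading the baseline would sit near 0.73, and under the absolute reading near 0.67. The paper's own results table is the authority on which applies.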

📝 Abstract
This paper presents DeepStage, a deep reinforcement learning (DRL) framework for adaptive, stage-aware defense against Advanced Persistent Threats (APTs). The enterprise environment is modeled as a partially observable Markov decision process (POMDP), where host provenance and network telemetry are fused into unified provenance graphs. Building on our prior work, StageFinder, a graph neural encoder and an LSTM-based stage estimator infer probabilistic attacker stages aligned with the MITRE ATT&CK framework. These stage beliefs, combined with graph embeddings, guide a hierarchical Proximal Policy Optimization (PPO) agent that selects defense actions across monitoring, access control, containment, and remediation. Evaluated in a realistic enterprise testbed using CALDERA-driven APT playbooks, DeepStage achieves a stage-weighted F1-score of 0.89, outperforming a risk-aware DRL baseline by 21.9%. The results demonstrate effective stage-aware and cost-efficient autonomous cyber defense.
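To make the POMDP framing concrete: the defender never observes the attacker's stage directly, so it maintains a probability distribution (a belief) over stages and updates it as telemetry arrives. The sketch below is a textbook discrete Bayes filter, not the DeepStage estimator, which according to the abstract is a learned graph neural encoder plus an LSTM. The three stage names are borrowed from ATT&CK tactics. The alert types and every probability in the transition and observation tables are hypothetical values chosen for illustration.

```python
# Illustrative subset of attacker stages (ATT&CK tactic names).
STAGES = ["initial_access", "lateral_movement", "exfiltration"]

# Hypothetical alert types standing in for fused host/network telemetry.
OBS = ["phishing_alert", "smb_scan", "large_upload"]

# Hypothetical transition model: T[s][s2] = P(next stage s2 | current stage s).
T = [
    [0.7, 0.3, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
]

# Hypothetical observation model: O[s][o] = P(alert o | stage s).
O = [
    [0.7, 0.2, 0.1],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]


def update_belief(belief, obs_index):
    """One Bayes-filter step: predict with T, correct with O, normalize."""
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    # Correction: weight each stage by the likelihood of the observed alert.
    unnormalized = [predicted[s2] * O[s2][obs_index] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Start certain that the campaign is at initial access, then fold in alerts.
belief = [1.0, 0.0, 0.0]
for alert in ["phishing_alert", "smb_scan", "large_upload"]:
    belief = update_belief(belief, OBS.index(alert))

most_likely = STAGES[belief.index(max(belief))]
print({stage: round(p, 3) for stage, p in zip(STAGES, belief)}, most_likely)
```

In the framework described above, a belief vector of this kind, combined with graph embeddings, is the input the hierarchical PPO agent uses to choose among monitoring, access control, containment, and remediation actions.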

Problem

Research questions and friction points this paper is trying to address.

Advanced Persistent Threats
multi-stage attacks
autonomous cyber defense
stage-aware defense
cybersecurity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Reinforcement Learning
Stage-Aware Defense
Provenance Graph
Hierarchical PPO
APT Campaigns

Trung V. Phan
Assistant Professor, Claremont Colleges (Pitzer & Scripps)
biophysics, robophysics, condensed matter, cancer chemotherapy, machine learning
Tri Gia Nguyen
Faculty of Information Technology, FPT University, Da Nang 50509, Vietnam
Thomas Bauschert
Chair of Communication Networks, Technische Universität Chemnitz, 09126 Chemnitz, Germany