🤖 AI Summary
This study addresses the difficulty that resource-constrained organizations face in translating cybersecurity governance frameworks, such as the NIST Cybersecurity Framework (CSF), into actionable defense decisions. The authors propose an integrated optimization approach that combines governance requirements, adversary knowledge from MITRE ATT&CK, and adversary-aware learning. By mapping NIST CSF maturity levels to ATT&CK mitigations, modeling attack paths with a variable-order Markov model, and formulating mitigation selection as a deep reinforcement learning problem, the method generates cost-effective, risk-balanced defense strategies under budget constraints. This work represents the first effort to unify governance frameworks, ATT&CK-based threat intelligence, and interpretable reinforcement learning, yielding operationally viable mitigation plans that align with an organization's security maturity and are robust against adaptive adversaries.
📝 Abstract
We address a fundamental challenge in cybersecurity operations: translating governance frameworks into actionable mitigation decisions under realistic resource constraints. Frameworks such as the NIST Cybersecurity Framework (CSF) provide widely adopted measures of organizational maturity, but they do not directly support the selection and prioritization of defensive strategies against adversarial behavior. We present a system that operationalizes governance frameworks by mapping CSF maturity assessments to MITRE ATT&CK mitigation capabilities, which enables direct integration of organizational security posture with adversary-informed defensive planning.
To manage adversary complexity, we employ a Variable-Order Markov Model (VOMM) trained on observed ATT&CK technique sequences to enable scalable adversary simulation within a Deep Reinforcement Learning (DRL) environment. We reconstruct likely attack paths and defensive responses using beam search, and then jointly optimize mitigation selection under explicit budget constraints.
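To make the adversary-modeling step concrete, the following is a minimal sketch of a variable-order Markov model with longest-context backoff, followed by a beam search over likely technique sequences. It is not the authors' implementation: the class `VOMM`, the function `beam_search`, the `max_order` and `beam_width` parameters, and the toy sequences in `TOY_SEQUENCES` are all assumptions made for illustration. The technique IDs are real ATT&CK identifiers, but their ordering here is invented, and the sketch covers attack paths only, not defensive responses or budget-constrained optimization.

```python
import math
from collections import Counter, defaultdict

# Toy training data (hypothetical orderings of real ATT&CK technique IDs).
TOY_SEQUENCES = [
    ["T1566", "T1059", "T1003", "T1021", "T1486"],
    ["T1566", "T1059", "T1003", "T1041"],
    ["T1078", "T1021", "T1486"],
]


class VOMM:
    """Variable-order Markov model with simple longest-context backoff."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # Maps a context tuple (length 0..max_order) to next-symbol counts.
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        for seq in sequences:
            for i, nxt in enumerate(seq):
                # Record the next symbol under every available context length.
                for k in range(min(self.max_order, i) + 1):
                    self.counts[tuple(seq[i - k:i])][nxt] += 1
        return self

    def next_probs(self, history):
        """Return P(next | history) using the longest context seen in training."""
        for k in range(min(self.max_order, len(history)), -1, -1):
            ctx = tuple(history[len(history) - k:])
            seen = self.counts.get(ctx)
            if seen:
                total = sum(seen.values())
                return {sym: n / total for sym, n in seen.items()}
        return {}  # model has not been trained


def beam_search(model, start, steps, beam_width=3):
    """Keep the beam_width most probable continuations of `start`."""
    beams = [(0.0, list(start))]  # (log-probability, path)
    for _ in range(steps):
        candidates = []
        for logp, path in beams:
            probs = model.next_probs(path)
            if not probs:
                candidates.append((logp, path))
                continue
            for sym, p in probs.items():
                candidates.append((logp + math.log(p), path + [sym]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams


if __name__ == "__main__":
    model = VOMM(max_order=2).fit(TOY_SEQUENCES)
    for logp, path in beam_search(model, ["T1566"], steps=3):
        print(f"{math.exp(logp):.2f}  {' -> '.join(path)}")
```

In a setup like the one the abstract describes, a model of this kind could serve as the adversary simulator inside the reinforcement learning environment, with `next_probs` sampled at each step. The backoff rule used here (fall back to a shorter context only when the longer one is unseen) is one simple choice among several; the paper may use a different smoothing or context-selection scheme.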
Our environment supports concurrent adversaries and realistic mitigation costs. Across multiple reward formulations and configurations, we show that the approach produces stable policies, meaningful cost-risk trade-offs, and interpretable mitigation plans aligned with organizational maturity. These results demonstrate that adversary-aware DRL can generate practical, resource-constrained defense strategies grounded in real-world frameworks and threat behavior.