Automated Security Response through Online Learning with Adaptive Conjectures

📅 2024-02-19
🏛️ arXiv.org
📈 Citations: 6
Influential: 0
🤖 AI Summary
Problem: Automated security response in IT infrastructure must cope with dynamic attacker-defender interactions, uncertainty about the infrastructure and the players' intents, and model misspecification. Method: This paper models advanced persistent threat (APT) scenarios as partially observed, non-stationary games and proposes Conjectural Online Learning (COL). COL combines Bayesian conjecture updating with rollout-based strategy updates, and its steady state is characterized by a variant of the Berk-Nash equilibrium. The paper proves that the conjectures converge to best fits and bounds the performance improvement that rollout enables under a conjectured, possibly misspecified, model. Efficient online learning is achieved via POMDP approximation. Contribution/Results: Testbed evaluations on an APT use case show that COL produces effective security strategies that adapt to a changing environment and that COL converges faster than current reinforcement learning techniques.

📝 Abstract
We study automated security response for an IT infrastructure and formulate the interaction between an attacker and a defender as a partially observed, non-stationary game. We relax the standard assumption that the game model is correctly specified and consider that each player has a probabilistic conjecture about the model, which may be misspecified in the sense that the true model has probability 0. This formulation allows us to capture uncertainty and misconception about the infrastructure and the intents of the players. To learn effective game strategies online, we design Conjectural Online Learning (COL), a novel method where a player iteratively adapts its conjecture using Bayesian learning and updates its strategy through rollout. We prove that the conjectures converge to best fits, and we provide a bound on the performance improvement that rollout enables with a conjectured model. To characterize the steady state of the game, we propose a variant of the Berk-Nash equilibrium. We present COL through an advanced persistent threat use case. Testbed evaluations show that COL produces effective security strategies that adapt to a changing environment. We also find that COL enables faster convergence than current reinforcement learning techniques.
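The Bayesian conjecture step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it leaves out the rollout step. It assumes a toy setting that is not taken from the paper: observations are Bernoulli, the player holds a finite set of candidate models, and the true parameter is absent from that set (the misspecified case). The names `bayes_update`, `kl_bernoulli`, and `run_conjecture_learning` are hypothetical.

```python
import math
import random


def bayes_update(posterior, likelihoods):
    """One Bayesian step: new weight is proportional to old weight * likelihood."""
    unnormalized = [p * l for p, l in zip(posterior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), for 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def run_conjecture_learning(true_p, candidates, steps, seed=0):
    """Update a conjecture over candidate Bernoulli parameters from samples.

    Assumption (toy setting): true_p need not be in `candidates`, so the
    conjecture assigns probability 0 to the true model.
    """
    rng = random.Random(seed)
    # Start from a uniform conjecture over the candidate models.
    posterior = [1.0 / len(candidates)] * len(candidates)
    for _ in range(steps):
        obs = 1 if rng.random() < true_p else 0
        # Likelihood of the observation under each candidate model.
        likelihoods = [q if obs == 1 else 1.0 - q for q in candidates]
        posterior = bayes_update(posterior, likelihoods)
    return posterior


if __name__ == "__main__":
    true_p = 0.3                   # true model, not among the candidates
    candidates = [0.1, 0.4, 0.8]   # hypothetical conjectured models
    posterior = run_conjecture_learning(true_p, candidates, steps=2000)
    divergences = [kl_bernoulli(true_p, q) for q in candidates]
    print("posterior:", posterior)
    print("KL to truth:", divergences)
```

In this toy setting the posterior mass concentrates on the candidate with the smallest Kullback-Leibler divergence from the true model, which is one way to read "best fit" when the true model has probability 0 under the conjecture.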

Problem

Research questions and friction points this paper is trying to address.

Adaptive Cybersecurity
Self-Learning Systems
Dynamic Game Theory

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conjectural Online Learning (COL)
Bayesian Conjecture Updating with Rollout
Berk-Nash Equilibrium Variant