A Comparative Evaluation of Teacher-Guided Reinforcement Learning Techniques for Autonomous Cyber Operations

📅 2025-08-19
🤖 AI Summary
In autonomous cyber operations (ACO), reinforcement learning (RL) agents trained from scratch suffer from slow convergence and poor initial policy performance. To address this, this paper introduces, for the first time, teacher-guided learning into ACO and systematically evaluates four guidance paradigms—behavioral cloning, offline RL, curriculum learning, and reward shaping—within the CybORG simulation environment. Experimental results demonstrate that teacher guidance significantly improves early-decision quality (by 42% on average), accelerates convergence (reducing training steps by ~35%), and enhances policy robustness. The study empirically validates the efficacy of knowledge transfer in dynamic cybersecurity decision-making and establishes a reproducible methodological framework for efficient, trustworthy training of ACO agents.
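
Of the guidance paradigms named in this summary, reward shaping is the simplest to illustrate in a few lines. The sketch below uses the standard potential-based form, r' = r + γΦ(s') − Φ(s), where the potential Φ encodes the teacher's opinion of a state. It is a generic illustration and not the paper's implementation: the state encoding (a tuple of booleans, one per host), the `teacher_potential` heuristic, and the function names are all assumptions made for this example and do not reflect CybORG's observation format.

```python
from typing import Callable, Tuple

# Hypothetical state encoding: one boolean per host, True = not compromised.
State = Tuple[bool, ...]


def teacher_potential(state: State) -> float:
    """Hypothetical teacher heuristic: count the hosts that are still clean."""
    return float(sum(1 for host_ok in state if host_ok))


def shaped_reward(
    env_reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float] = teacher_potential,
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Potential-based shaping: r + gamma * phi(s') - phi(s).

    The potential of a terminal next state is taken as zero, the usual
    convention for episodic tasks.
    """
    next_phi = 0.0 if done else potential(next_state)
    return env_reward + gamma * next_phi - potential(state)


if __name__ == "__main__":
    before = (True, True, False)   # one host compromised
    after = (True, True, True)     # the agent restored it
    print(shaped_reward(-1.0, before, after, gamma=0.5))
```

Because the bonus is a difference of potentials, it rewards transitions the teacher considers progress without changing which policies are optimal in the underlying task. That property holds for potential-based shaping in general; whether the paper uses this exact form is not stated here.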

📝 Abstract
Autonomous Cyber Operations (ACO) rely on Reinforcement Learning (RL) to train agents to make effective decisions in the cybersecurity domain. However, existing ACO applications require agents to learn from scratch, leading to slow convergence and poor early-stage performance. While teacher-guided techniques have demonstrated promise in other domains, they have not yet been applied to ACO. In this study, we implement four distinct teacher-guided techniques in the simulated CybORG environment and conduct a comparative evaluation. Our results demonstrate that teacher integration can significantly improve training efficiency in terms of early policy performance and convergence speed, highlighting its potential benefits for autonomous cybersecurity.
Problem

Research questions and friction points this paper is trying to address.

Improving slow convergence in autonomous cyber reinforcement learning
Enhancing early-stage performance of cybersecurity decision agents
Applying teacher-guided techniques to autonomous cyber operations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Teacher-guided reinforcement learning techniques
Simulated CybORG environment implementation
Improved training efficiency and convergence
Konur Tholl
Royal Military College of Canada, Electrical and Computer Engineering
Mariam El Mezouar
Assistant Professor at the Royal Military College of Canada
Mining Software Repositories, Empirical Software Engineering, Collaborative Software Development
Ranwa Al Mallah
Polytechnique Montreal, Computer and Software Engineering