🤖 AI Summary
Traditional penetration testing relies heavily on manual effort, resulting in low efficiency, poor scalability, and inadequate responsiveness to the security demands of complex systems. This paper presents a systematic review of 58 AI-augmented penetration testing studies spanning the reconnaissance, exploitation, and post-exploitation phases, with focused analysis of automated vulnerability discovery, attack-path modeling, and network-topology inference. We identify reinforcement learning as the dominant AI paradigm (77% of surveyed works), uncover a critical gap in large language model adoption, and reveal flexibility bottlenecks in models for reconnaissance and post-exploitation. Our findings directly inform the development and deployment of practical tools, including ESA PenBox, which demonstrate substantial reductions in analyst workload, improved vulnerability-detection throughput, and more precise attack-chain analysis.
📝 Abstract
Penetration testing is a cornerstone of cybersecurity, traditionally driven by manual, time-intensive processes. As systems grow in complexity, there is a pressing need for more scalable and efficient testing methodologies. This systematic literature review examines how Artificial Intelligence (AI) is reshaping penetration testing, analyzing 58 peer-reviewed studies from major academic databases. Our findings reveal that while AI-assisted pentesting is still in its early stages, notable progress is underway, particularly through Reinforcement Learning (RL), the focus of 77% of the reviewed works. Most research centers on the discovery and exploitation phases of pentesting, where AI shows the greatest promise in automating repetitive tasks, optimizing attack strategies, and improving vulnerability identification. Real-world applications remain limited but encouraging, including the European Space Agency's PenBox and various open-source tools. These demonstrate AI's potential to streamline attack-path analysis, map complex network topologies, and reduce manual workload. However, challenges persist: current models often lack flexibility and remain underdeveloped for the reconnaissance and post-exploitation phases. Applications involving Large Language Models (LLMs) remain relatively under-researched, pointing to a promising direction for future work. This paper offers a critical overview of AI's current and potential role in penetration testing, providing valuable insights for researchers, practitioners, and organizations aiming to enhance security assessments through advanced automation or to identify gaps in existing research.