AutoPentester: An LLM Agent-based Framework for Automated Pentesting

📅 2025-10-07
🤖 AI Summary
Manual penetration testing struggles to scale against escalating cyber threats, while existing automation tools (e.g., PentestGPT) remain heavily reliant on expert intervention. Method: This paper proposes AutoPentester, a fully autonomous penetration testing framework powered by a large language model (LLM)-based agent. Given only a target IP address, the agent dynamically generates reconnaissance strategies and autonomously orchestrates common security tools, including Nmap and Metasploit, to perform vulnerability identification, exploit chain construction, and iterative feedback-driven refinement. Contribution/Results: The core innovation is an LLM agent architecture that integrates tool orchestration, structured output parsing, and closed-loop control to emulate human-like penetration-testing reasoning end-to-end. Evaluated on Hack The Box machines and custom lab environments, the framework achieves a 27.0% higher subtask completion rate and 39.5% greater vulnerability coverage than PentestGPT, and attains a user satisfaction score of 3.93/5, 19.8% higher than PentestGPT's.
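The closed-loop control the summary describes (propose an action, run the tool, feed the output back to the planner) can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: `Step`, `pentest_loop`, and the stub planner/runner are all hypothetical names, and the real system would call an LLM and real tools such as Nmap and Metasploit instead of the stubs.

```python
from dataclasses import dataclass

@dataclass
class Step:
    tool: str      # e.g. "nmap" or "metasploit"
    command: str   # command line the agent proposes
    output: str = ""

def pentest_loop(target_ip, plan_next, run_tool, max_iters=10):
    """Closed-loop control: each iteration hands the full history,
    including previous tool outputs, back to the planner, so the next
    strategy is conditioned on what was just observed."""
    history = []
    for _ in range(max_iters):
        step = plan_next(target_ip, history)  # LLM-backed planner in the real system
        if step is None:                      # planner signals the pentest is complete
            break
        step.output = run_tool(step)          # execute the tool, capture structured output
        history.append(step)
    return history

# Stubbed planner and runner, purely for illustration: reconnaissance
# first, then one follow-up action, then stop.
def stub_plan(target_ip, history):
    if not history:
        return Step("nmap", f"nmap -sV {target_ip}")
    if len(history) == 1:
        return Step("metasploit", "run selected exploit module")
    return None

def stub_run(step):
    return f"simulated output of: {step.command}"
```

In the real framework the planner's decision would come from parsing the LLM's structured response; the loop shape, however, is the same: plan, execute, observe, refine.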

📝 Abstract
Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale and complexity, the demand for pentesting has surged, surpassing the capacity of human professionals to meet it effectively. With advances in AI, particularly Large Language Models (LLMs), there have been attempts to automate the pentesting process. However, existing tools such as PentestGPT are still semi-manual, requiring significant professional human interaction to conduct pentests. To this end, we propose a novel LLM agent-based framework, AutoPentester, which automates the pentesting process. Given a target IP, AutoPentester automatically conducts pentesting steps using common security tools in an iterative process. It can dynamically generate attack strategies based on the tool outputs from the previous iteration, mimicking the human pentester approach. We evaluate AutoPentester using Hack The Box and custom-made VMs, comparing the results with the state-of-the-art PentestGPT. Results show that AutoPentester achieves a 27.0% better subtask completion rate and 39.5% more vulnerability coverage with fewer steps. Most importantly, it requires significantly fewer human interactions and interventions compared to PentestGPT. Furthermore, we recruit a group of security industry professional volunteers for a user survey and perform a qualitative analysis to evaluate AutoPentester against industry practices and compare it with PentestGPT. On average, AutoPentester received a score of 3.93 out of 5 based on user reviews, which was 19.8% higher than PentestGPT.
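The abstract's iterative process depends on turning raw tool output into something the agent can reason over. As one hedged example of that kind of structured output parsing (the paper does not specify its parser), the sketch below extracts open TCP ports and service names from Nmap's grepable (`-oG`) output format; `parse_nmap_grepable` and the dict layout are illustrative choices, not the paper's.

```python
import re

# Nmap -oG port entries look like: 22/open/tcp//ssh///
# (port/state/protocol/owner/service/rpc/version). We capture the port
# number and service name of open TCP ports only.
PORT_RE = re.compile(r"(\d+)/open/tcp//([\w-]*)")

def parse_nmap_grepable(text):
    """Turn Nmap grepable output into a list of structured findings."""
    findings = []
    for line in text.splitlines():
        if "Ports:" not in line:
            continue  # skip status lines without port data
        host = line.split()[1]  # second token is the host address
        for port, service in PORT_RE.findall(line):
            findings.append({"host": host, "port": int(port), "service": service})
    return findings
```

A finding list like this is what lets the agent decide, for example, to follow up an open HTTP service with web-specific enumeration in the next iteration.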
Problem

Research questions and friction points this paper is trying to address.

Automating penetration testing to reduce human effort
Generating dynamic attack strategies using LLM agents
Improving vulnerability coverage with automated iterative processes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automates penetration testing using LLM agents
Dynamically generates attack strategies iteratively
Reduces human intervention while improving vulnerability coverage