Can LLMs Hack Enterprise Networks? -- Replicated Computational Results (RCR) Report

📅 2026-03-02
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study systematically evaluates the practical capabilities of large language models (LLMs) in enterprise network penetration testing, focusing on automated attack scenarios within simulated Microsoft Active Directory environments. By constructing a reproducible experimental framework that integrates multiple LLM interfaces with custom evaluation scripts, the work provides empirical validation of LLMs' ability to execute complex penetration tasks in realistic enterprise red-team simulations. The research reproduces and extends prior findings, demonstrating that certain LLMs can effectively conduct penetration activities under specific conditions. The authors also open-source the complete toolchain and experimental pipeline, improving reproducibility and standardization in AI-driven security research.
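The framework described above drives an LLM through an assumed-breach episode: the model proposes the next command, a sandboxed executor runs it against the simulated environment, and the transcript is scored against the task goal. The following is a minimal sketch of that loop, not the authors' actual code; all names (`run_episode`, the scripted stand-ins for the LLM and the lab) are illustrative assumptions.

```python
# Hypothetical sketch of an assumed-breach evaluation loop: an LLM proposes
# commands, a sandbox executes them, and a judge checks goal completion.
# This is NOT the paper's implementation; names and commands are illustrative.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Step:
    command: str   # command the LLM proposed
    output: str    # what the (simulated) environment returned

@dataclass
class Episode:
    goal: str
    steps: List[Step] = field(default_factory=list)

def run_episode(goal: str,
                propose: Callable[[Episode], str],
                execute: Callable[[str], str],
                success: Callable[[Episode], bool],
                max_steps: int = 10) -> Episode:
    """Drive one pentest episode: propose -> execute -> score, up to a budget."""
    ep = Episode(goal=goal)
    for _ in range(max_steps):
        cmd = propose(ep)                      # in the real setup: an LLM API call
        ep.steps.append(Step(cmd, execute(cmd)))
        if success(ep):                        # goal reached, stop early
            break
    return ep

# Stub components standing in for a real LLM and a real Active Directory lab:
def scripted_llm(ep: Episode) -> str:
    playbook = ["nxc smb 10.0.0.5 -u alice -p Pass123",
                "impacket-secretsdump corp/alice:Pass123@10.0.0.5"]
    return playbook[min(len(ep.steps), len(playbook) - 1)]

def fake_sandbox(cmd: str) -> str:
    return "NTLM hashes dumped" if "secretsdump" in cmd else "SMB session OK"

episode = run_episode("dump domain credentials",
                      scripted_llm, fake_sandbox,
                      success=lambda ep: "hashes dumped" in ep.steps[-1].output)
```

Swapping `scripted_llm` for a wrapper around a real model API, and `fake_sandbox` for an executor inside the simulated network, yields the shape of pipeline the report describes; the episode transcript is then what the analysis scripts consume.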

๐Ÿ“ Abstract
This is the Replicated Computational Results (RCR) Report for the paper "Can LLMs Hack Enterprise Networks?" The paper empirically investigates the efficacy and effectiveness of different LLMs for penetration-testing enterprise networks, i.e., Microsoft Active Directory Assumed-Breach Simulations. This RCR report describes the artifacts used in the paper, how to create an evaluation setup, and highlights the analysis scripts provided within our prototype.
Problem

Research questions and friction points this paper is trying to address.

LLMs
penetration testing
enterprise networks
Active Directory
assumed-breach simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Penetration Testing
Active Directory
Assumed-Breach Simulation
Replicable Evaluation