KryptoPilot: An Open-World Knowledge-Augmented LLM Agent for Automated Cryptographic Exploitation

📅 2026-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the weak performance of existing large language model (LLM) agents on high-difficulty cryptographic CTF challenges, which stems from insufficient cryptanalytic precision, unstable long-horizon reasoning, and poor integration with specialized toolchains. To overcome these limitations, we propose KryptoPilot, an open-world knowledge-augmented agent architecture that combines fine-grained, research-driven knowledge acquisition, a persistent structured workspace, cost-aware model routing, and behavior-constrained governance mechanisms to achieve accurate knowledge alignment and stable, efficient reasoning. The system integrates a dedicated cryptographic toolchain and achieves a 100% solve rate on InterCode-CTF, solves 56% to 60% of cryptographic challenges on the NYU-CTF benchmark, and solves 26 of 33 cryptographic challenges across six real-world CTF competitions, including multiple first-blood and unique solutions.

📝 Abstract
Capture-the-Flag (CTF) competitions play a central role in modern cybersecurity as a platform for training practitioners and evaluating offensive and defensive techniques derived from real-world vulnerabilities. Despite recent advances in large language models (LLMs), existing LLM-based agents remain ineffective on high-difficulty cryptographic CTF challenges, which require precise cryptanalytic knowledge, stable long-horizon reasoning, and disciplined interaction with specialized toolchains. Through a systematic exploratory study, we show that insufficient knowledge granularity, rather than model reasoning capacity, is a primary factor limiting successful cryptographic exploitation: coarse or abstracted external knowledge often fails to support correct attack modeling and implementation. Motivated by this observation, we propose KryptoPilot, an open-world knowledge-augmented LLM agent for automated cryptographic exploitation. KryptoPilot integrates dynamic open-world knowledge acquisition via a Deep Research pipeline, a persistent workspace for structured knowledge reuse, and a governance subsystem that stabilizes reasoning through behavioral constraints and cost-aware model routing. This design enables precise knowledge alignment while maintaining efficient reasoning across heterogeneous subtasks. We evaluate KryptoPilot on two established CTF benchmarks and in six real-world CTF competitions. KryptoPilot achieves a complete solve rate on InterCode-CTF, solves between 56 and 60 percent of cryptographic challenges on the NYU-CTF benchmark, and successfully solves 26 out of 33 cryptographic challenges in live competitions, including multiple earliest-solved and uniquely-solved instances. These results demonstrate the necessity of open-world, fine-grained knowledge augmentation and governed reasoning for scaling LLM-based agents to real-world cryptographic exploitation.
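The abstract mentions cost-aware model routing but does not say how it is implemented. As a rough illustration only, the sketch below assumes one plausible policy: send each subtask to the cheapest model whose capability meets the subtask's difficulty, and fall back to the strongest model otherwise. The names `Model`, `Subtask`, `route`, and `MODELS`, the 1-to-5 scales, and the cost figures are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str             # hypothetical model label, not from the paper
    capability: int       # assumed scale: 1 (weak) to 5 (strong)
    cost_per_call: float  # assumed relative cost units


@dataclass(frozen=True)
class Subtask:
    description: str
    difficulty: int       # assumed to use the same 1-to-5 scale


def route(subtask: Subtask, models: list[Model]) -> Model:
    """Pick the cheapest model able to handle the subtask.

    If no model's capability reaches the subtask's difficulty,
    fall back to the most capable model available.
    """
    eligible = [m for m in models if m.capability >= subtask.difficulty]
    if eligible:
        return min(eligible, key=lambda m: m.cost_per_call)
    return max(models, key=lambda m: m.capability)


# Illustrative model pool; the values are made up for the example.
MODELS = [
    Model("small", capability=2, cost_per_call=1.0),
    Model("medium", capability=3, cost_per_call=4.0),
    Model("large", capability=5, cost_per_call=20.0),
]

if __name__ == "__main__":
    for task in (
        Subtask("parse challenge files", difficulty=1),
        Subtask("derive lattice attack parameters", difficulty=5),
    ):
        print(task.description, "->", route(task, MODELS).name)
```

Under this assumed policy, routine subtasks stay on inexpensive models and only the hardest cryptanalytic steps incur the cost of the strongest one, which is one way to read the abstract's goal of efficient reasoning across heterogeneous subtasks.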
Problem

Research questions and friction points this paper is trying to address.

cryptographic exploitation
Capture-the-Flag
large language models
knowledge augmentation
open-world knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

open-world knowledge augmentation
cryptographic exploitation
LLM agent
governed reasoning
Deep Research pipeline