PocketAgents: A Manifest-Driven Library of Autonomous Defense Agents

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a manifest-driven architecture for autonomous defense agents that safely and reliably integrates large language models (LLMs) into active defense systems. Each agent's behavior is constrained by a manifest, a prompt, and a runtime context, and the agent interacts with the system only through typed reports inside a restricted runtime. The typed boundary and the manifest check keep LLM outputs verifiable, traceable, and bounded in their potential impact, which makes defensive actions measurable, extensible, and attributable. In 18 closed-loop trials on the Perry cyber-deception testbed, the system produced 13 validated network-block actions, rejected 4 reports that failed schema validation, and returned a valid no-action decision in 1 trial.

📝 Abstract

Connecting large language models (LLMs) to defensive enforcement requires more than asking a model whether an attack is happening. A defender must decide which model outputs may change the system state, which outputs must be rejected, and how failures should be recorded. We present PocketAgents, a manifest-driven library of autonomous defense agents. Each agent is installed as three data files: a manifest, a prompt, and a runtime context. The shared runtime gives the agent bounded telemetry access and accepts only typed reports whose requested action appears in the manifest. We implemented PocketAgents on top of the Perry cyber arena, a cyber-deception testbed, and evaluated two agents, Command and Control and Exfiltration, in 18 closed-loop trials of a DarkSide-inspired attack on a small enterprise topology. Thirteen trials produced validated network-block actions and contained the attack; four failed schema validation; one produced a valid no-action decision. The experiments show that a typed boundary makes LLM-driven defense measurable, extensible, and attributable.
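
The typed boundary described in the abstract can be illustrated with a minimal Python sketch. It is not the PocketAgents implementation: the manifest layout, the field names (`agent`, `action`, `target`, `rationale`), and the action names (`block_network`, `no_action`, `wipe_host`) are all hypothetical, chosen only to show the idea that a report is accepted when it is well-typed and its requested action appears in the manifest, and that every rejection carries a recordable reason.

```python
from dataclasses import dataclass

# Hypothetical manifest for one agent. The keys and action names are
# illustrative assumptions, not the PocketAgents schema.
MANIFEST = {
    "agent": "exfiltration",
    "allowed_actions": ["block_network", "no_action"],
}

# Assumed report schema: every field must be present and be a string.
REQUIRED_FIELDS = {"agent": str, "action": str, "target": str, "rationale": str}


@dataclass(frozen=True)
class Decision:
    accepted: bool
    reason: str  # kept so that failures can be logged and attributed


def validate_report(report: object, manifest: dict) -> Decision:
    """Accept a report only if it is well-typed and its action is in the manifest."""
    if not isinstance(report, dict):
        return Decision(False, "schema: report is not an object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(report.get(name), expected):
            return Decision(False, f"schema: field '{name}' missing or not {expected.__name__}")
    if report["agent"] != manifest["agent"]:
        return Decision(False, "manifest: report comes from a different agent")
    if report["action"] not in manifest["allowed_actions"]:
        return Decision(False, f"manifest: action '{report['action']}' not in manifest")
    return Decision(True, "validated")


# Three model outputs: a valid block request, an action outside the manifest,
# and a report that is missing required fields. 192.0.2.10 is a documentation address.
valid = {"agent": "exfiltration", "action": "block_network",
         "target": "192.0.2.10", "rationale": "large outbound transfer"}
unlisted = {**valid, "action": "wipe_host"}
malformed = {"agent": "exfiltration", "action": "block_network"}

audit_log = [validate_report(r, MANIFEST) for r in (valid, unlisted, malformed)]
for entry in audit_log:
    print(entry)
```

Under these assumptions only the first report would reach enforcement. The other two are rejected with distinct reasons, which is the kind of record that lets schema failures be counted separately from validated actions and no-action decisions.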

Problem

Research questions and friction points this paper is trying to address.

autonomous defense

large language models

cybersecurity

manifest-driven

typed boundary

Innovation

Methods, ideas, or system contributions that make the work stand out.

manifest-driven

autonomous defense agents

typed boundary