Dynamic Risk Assessments for Offensive Cybersecurity Agents

📅 2025-05-23

🤖 AI Summary

Problem: Existing model auditing frameworks largely overlook the ability of realistic adversaries to iteratively improve an agent under a limited compute budget, and so fail to capture the dynamic cybersecurity risks posed by autonomous programming agents. Method: We propose an expanded threat model that accounts for the adversary's degrees of freedom in stateful and non-stateful environments and for iterative improvement within a fixed compute budget (8 H100 GPU-hours in the study). Agents are evaluated on InterCode CTF, where a strong verifier provides the feedback that drives improvement. Contribution/Results: With this small budget and no external assistance, adversaries improve an agent's cybersecurity capability on InterCode CTF by more than 40% relative to the baseline. The work argues for moving from static audits to dynamic risk assessment, which gives a more representative picture of the risk posed by compute-constrained adversaries.

📝 Abstract

Foundation models are increasingly becoming better autonomous programmers, raising the prospect that they could also automate dangerous offensive cyber-operations. Current frontier model audits probe the cybersecurity risks of such agents, but most fail to account for the degrees of freedom available to adversaries in the real world. In particular, with strong verifiers and financial incentives, agents for offensive cybersecurity are amenable to iterative improvement by would-be adversaries. We argue that assessments should take into account an expanded threat model in the context of cybersecurity, emphasizing the varying degrees of freedom that an adversary may possess in stateful and non-stateful environments within a fixed compute budget. We show that even with a relatively small compute budget (8 H100 GPU Hours in our study), adversaries can improve an agent's cybersecurity capability on InterCode CTF by more than 40% relative to the baseline -- without any external assistance. These results highlight the need to evaluate agents' cybersecurity risk in a dynamic manner, painting a more representative picture of risk.
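The iterative improvement described in the abstract can be illustrated with a minimal sketch of one simple strategy: retrying a task until a verifier accepts the result or a fixed compute budget runs out. This is not the paper's method. The function `attempt_until_verified`, its parameters, and the toy proposer and verifier are all hypothetical names chosen for illustration, and the cost unit is arbitrary.

```python
from typing import Callable, Optional, Tuple


def attempt_until_verified(
    propose: Callable[[int], str],
    verify: Callable[[str], bool],
    cost_per_attempt: float,
    budget: float,
) -> Tuple[Optional[str], float]:
    """Retry until the verifier accepts a candidate or the budget is exhausted.

    All names here are hypothetical. `propose(i)` returns the i-th candidate,
    `verify` stands in for a strong verifier (for example, a CTF flag check).
    Returns the accepted candidate (or None) and the compute spent.
    """
    spent = 0.0
    attempt = 0
    # Only start an attempt if it fits within the remaining budget.
    while spent + cost_per_attempt <= budget:
        candidate = propose(attempt)
        spent += cost_per_attempt
        if verify(candidate):
            return candidate, spent
        attempt += 1
    return None, spent


# Toy stand-ins: a proposer that cycles through guesses and an exact-match verifier.
guesses = ["flag{wrong}", "flag{nope}", "flag{demo}"]
toy_propose = lambda i: guesses[i % len(guesses)]
toy_verify = lambda s: s == "flag{demo}"

solved, used = attempt_until_verified(toy_propose, toy_verify, 1.0, 8.0)
unsolved, used_small = attempt_until_verified(toy_propose, toy_verify, 1.0, 2.0)
```

With the toy inputs, a budget of 8 units is enough to reach the accepted guess on the third attempt, while a budget of 2 units is not. The point of the sketch is only that a reliable verifier turns extra compute directly into a higher chance of success, which is why a static, single-attempt audit can understate risk.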

Problem

Research questions and friction points this paper is trying to address.

Assessing cybersecurity risks of autonomous offensive agents dynamically

Accounting for adversaries' degrees of freedom in stateful and non-stateful environments

Evaluating agent capability improvement under limited compute budgets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic risk assessments for cybersecurity agents

Iterative improvement with strong verifiers

A small compute budget (8 H100 GPU-hours) improves capability on InterCode CTF by more than 40% relative to the baseline
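Because the headline result is stated relative to the baseline, it helps to separate relative from absolute improvement. The sketch below shows the arithmetic. The helper `relative_improvement` and the success rates used are hypothetical illustrations, not figures from the paper.

```python
import math


def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain as a fraction of the baseline score."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical success rates, chosen only to illustrate the arithmetic.
baseline_rate = 0.50
improved_rate = 0.70

absolute_gain = improved_rate - baseline_rate                         # 20 percentage points
relative_gain = relative_improvement(baseline_rate, improved_rate)   # about 40% of the baseline
```

In this illustration a 20-point absolute gain corresponds to a relative gain of about 40%, so the same result can look quite different depending on which measure is reported.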
