🤖 AI Summary
In multi-tenant cloud environments, tool invocations by AI agents induce severe and unpredictable fluctuations in OS-level resources, which existing container-based resource control mechanisms struggle to mitigate effectively due to their coarse granularity and slow response. This work is the first to characterize the memory spike patterns driven by tool calls and introduces a novel resource control paradigm operating at the granularity of individual tool invocations. To realize this approach, we design AgentCgroup, a kernel-level controller leveraging eBPF that enforces precise, low-overhead resource isolation through a hierarchical cgroup structure aligned with tool-call boundaries, runtime-adaptive policies, and advanced mechanisms such as sched_ext and memcg_bpf_ops. Preliminary evaluation demonstrates that AgentCgroup substantially enhances multi-tenant isolation while reducing resource wastage.
📝 Abstract
AI agents are increasingly deployed in multi-tenant cloud environments, where they execute diverse tool calls within sandboxed containers, each call with distinct resource demands and rapid fluctuations. We present a systematic characterization of OS-level resource dynamics in sandboxed AI coding agents, analyzing 144 software engineering tasks from the SWE-rebench benchmark across two LLM models. Our measurements reveal that (1) OS-level execution (tool calls, container and agent initialization) accounts for 56-74% of end-to-end task latency; (2) memory, not CPU, is the concurrency bottleneck; (3) memory spikes are tool-call-driven with a up to 15.4x peak-to-average ratio; and (4) resource demands are highly unpredictable across tasks, runs, and models. Comparing these characteristics against serverless, microservice, and batch workloads, we identify three mismatches in existing resource controls: a granularity mismatch (container-level policies vs. tool-call-level dynamics), a responsiveness mismatch (user-space reaction vs. sub-second unpredictable bursts), and an adaptability mismatch (history-based prediction vs. non-deterministic stateful execution). We propose AgentCgroup , an eBPF-based resource controller that addresses these mismatches through hierarchical cgroup structures aligned with tool-call boundaries, in-kernel enforcement via sched_ext and memcg_bpf_ops, and runtime-adaptive policies driven by in-kernel monitoring. Preliminary evaluation demonstrates improved multi-tenant isolation and reduced resource waste.