DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing AI agents are hindered in high-frequency state exploration by the high latency (hundreds of milliseconds to seconds) of full sandbox checkpoint/rollback (C/R) mechanisms, which limits how deeply they can search and how far they can scale. This work proposes DeltaBox, which introduces DeltaState, an operating system–level abstraction that records only the differential state between consecutive checkpoints, enabling a change-based transactional C/R mechanism. Its core innovations are the co-designed DeltaFS and DeltaCR: DeltaFS employs a layered copy-on-write file system to support efficient file-state C/R, while DeltaCR accelerates rollback through incremental process snapshots combined with direct forking of frozen template processes. Experiments on SWE-bench and reinforcement learning microbenchmarks show checkpoint and rollback latencies reduced to 14 ms and 5 ms, respectively, allowing agents to explore substantially more states within a fixed time budget.
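To make the layered copy-on-write idea concrete, the following is a minimal in-memory sketch, not DeltaFS itself. It assumes a checkpoint freezes the current writable layer and pushes a fresh one, and that rollback simply switches back to an earlier layer stack. The class `LayeredStore` and all of its method names are hypothetical, and the model is a linear stack of layers, whereas tree search would need branching.

```python
class LayeredStore:
    """Toy model of a layered copy-on-write file store (illustrative only)."""

    _DELETED = object()  # tombstone marking a path removed in some layer

    def __init__(self):
        self._frozen = []  # read-only layers, oldest first
        self._top = {}     # the single writable layer

    def write(self, path, data):
        # Writes only ever touch the writable layer; frozen layers stay intact.
        self._top[path] = data

    def delete(self, path):
        self._top[path] = self._DELETED

    def read(self, path):
        # Search from the newest layer down to the oldest.
        for layer in [self._top, *reversed(self._frozen)]:
            if path in layer:
                value = layer[path]
                if value is self._DELETED:
                    raise FileNotFoundError(path)
                return value
        raise FileNotFoundError(path)

    def checkpoint(self):
        """Freeze the writable layer and insert a new empty one.

        The cost does not depend on how much data the older layers hold.
        Returns an id that can later be passed to rollback().
        """
        self._frozen.append(self._top)
        self._top = {}
        return len(self._frozen)

    def rollback(self, ckpt_id):
        """Drop every layer created after the checkpoint and start fresh."""
        if not 0 <= ckpt_id <= len(self._frozen):
            raise ValueError(f"unknown checkpoint {ckpt_id}")
        del self._frozen[ckpt_id:]
        self._top = {}


store = LayeredStore()
store.write("a.txt", "v1")
ckpt = store.checkpoint()
store.write("a.txt", "v2")   # shadows the frozen copy, does not modify it
store.rollback(ckpt)
print(store.read("a.txt"))
```

In this toy model neither operation copies existing data: a checkpoint appends one layer and a rollback truncates the layer list, which is the property the change-based design relies on.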

📝 Abstract

LLM-powered AI agents require high-frequency state exploration (e.g., test-time tree search and reinforcement learning), relying on rapid checkpoint and rollback (C/R) of the complete sandbox state, including files and process state (e.g., memory and contexts). Existing mechanisms duplicate the entire state, causing hundreds of milliseconds to seconds of latency per C/R, which severely bottlenecks deep search and large-scale fan-outs. This paper observes that consecutive checkpoints in AI agents are highly similar. Therefore, instead of full duplication, a sandbox should only duplicate the changes between consecutive checkpoints (Key Insight). However, realizing this idea is non-trivial, mainly because the necessary OS support is missing. This paper proposes a new OS-level abstraction, DeltaState, to enable change-based transactional C/R for AI agents with two co-designed OS mechanisms. First, DeltaFS enables change-based filesystem C/R by organizing file state into layers, dynamically freezing the writable layer and inserting a new one at each checkpoint, which reduces file updates to copy-on-write and makes rollback a simple layer switch. Second, DeltaCR enables change-based process-state C/R using incremental dumps, and accelerates rollback by bypassing traditional restore pipelines to directly fork() from a frozen template process. We then present DeltaBox, a novel agent sandbox achieving millisecond-level C/R through the two new mechanisms. Evaluations on SWE-bench and RL micro-benchmarks show DeltaBox completes checkpoint and rollback with millisecond-level latency (14 ms and 5 ms, respectively), empowering agents to explore substantially more nodes under fixed time budgets.
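The idea of rolling back by forking from a template process can be illustrated with the standard POSIX `fork()` call. This is a sketch under stated assumptions and not the DeltaCR mechanism: here the calling Python process plays the role of the template, each exploration step runs in a forked child that inherits the template's memory copy-on-write, and "rollback" amounts to discarding the child and forking again. The helper `run_from_template` is a hypothetical name, and the code requires a POSIX system.

```python
import os


def run_from_template(state, action):
    """Run action(state) in a forked child and return repr() of its result.

    The parent (the "template") never sees the child's mutations, because
    fork() gives the child a copy-on-write view of the parent's memory.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: mutate its private copy of the state and report back.
        os.close(read_fd)
        try:
            result = action(state)
            os.write(write_fd, repr(result).encode())
        finally:
            os._exit(0)  # skip parent cleanup handlers in the child
    # Parent: collect the child's output, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return data.decode()


def bump(state):
    state["counter"] += 1
    return state["counter"]


template_state = {"counter": 0}
first = run_from_template(template_state, bump)
second = run_from_template(template_state, bump)
# Both children started from the same template state, and the template
# itself is unchanged after they exit.
print(first, second, template_state["counter"])
```

Because every child starts from the same unmodified template, restoring the earlier state needs no explicit restore step in this sketch. The abstract attributes DeltaCR's rollback speed to a similar shortcut applied at the OS level.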

Problem

Research questions and friction points this paper is trying to address.

checkpoint/rollback

stateful AI agents

sandbox

latency bottleneck

high-frequency state exploration

Innovation

Methods, ideas, or system contributions that make the work stand out.

DeltaState

checkpoint/rollback

change-based C/R