AI Summary
Existing relational databases lack the persistent isolation and concurrent branching that agent workloads require, which limits their support for speculative modifications and nonlinear state exploration. This work presents the first systematic definition and quantitative evaluation of database branching for intelligent agents, introducing BranchBench, a benchmark that combines parameterized macrobenchmarks (modeling branch-mutate-evaluate cycles) with microbenchmarks (measuring branch lifecycle overhead). We evaluate systems including Neon, DoltgreSQL, Tiger Data, Xata, and PostgreSQL, and find that systems optimized for fast branching suffer 5–4000× slower reads as branch depth increases, while systems optimized for fast data operations incur 25–1500× higher latency for branch creation and switching. Our experiments show that no current system scales to representative agent workloads such as software engineering tasks, failure reproduction, or Monte Carlo Tree Search (MCTS).
Abstract
Branchable databases are evolving from developer tools to infrastructure for agentic workloads characterized by speculative mutations and non-linear state exploration. Traditional RDBMS mechanisms such as nested transactions do not provide the persistent isolation and concurrent branch management required by autonomous agents, and recent "zero-copy" designs make different trade-offs whose impact on agentic workloads remains unclear.
To clarify this space, we present BranchBench, a benchmark for evaluating branchable relational DBMSes under agentic exploration. We characterize five representative workloads (agentic software engineering, failure reproduction, data curation, MCTS, and simulation) and design parameterized macrobenchmarks that execute branch-mutate-evaluate loops to reflect these workloads, along with microbenchmarks that isolate branch lifecycle costs. We evaluate state-of-the-art systems including Neon, DoltgreSQL, Tiger Data, Xata, and PostgreSQL baselines, and find a fundamental tension: systems optimized for fast branching suffer 5–4000× slower reads as branches deepen, while systems optimized for fast data operations incur 25–1500× higher branch creation and switching latency. Further, no current system supports the representative workloads at scale. These results highlight the need for branch-native DBMSes designed specifically for agentic exploration.
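To make the branch-mutate-evaluate pattern concrete, the following is a minimal in-memory sketch. It is not BranchBench and does not reflect how any of the evaluated systems are implemented. The `Branch` class, the `explore` function, and the `depth`, `fanout`, and `seed` parameters are hypothetical names introduced here for illustration. The toy model assumes a copy-on-write design in which a branch stores only its own writes and a link to its parent, so branch creation is cheap but a read may have to walk the ancestor chain.

```python
from __future__ import annotations

import random
from dataclasses import dataclass, field


@dataclass
class Branch:
    """Toy copy-on-write branch: holds only its own writes plus a parent link."""

    parent: Branch | None = None
    delta: dict[str, int] = field(default_factory=dict)

    def fork(self) -> Branch:
        # Constant-time branch creation: no data is copied.
        return Branch(parent=self)

    def write(self, key: str, value: int) -> None:
        # Writes stay local, so siblings and ancestors never observe them.
        self.delta[key] = value

    def read(self, key: str) -> tuple[int | None, int]:
        # Walk up the ancestor chain; return the value and the hops taken.
        node, hops = self, 0
        while node is not None:
            if key in node.delta:
                return node.delta[key], hops
            node, hops = node.parent, hops + 1
        return None, hops


def explore(root: Branch, depth: int, fanout: int, seed: int = 0) -> Branch:
    """Hypothetical branch-mutate-evaluate loop: fork, mutate, keep the best."""
    rng = random.Random(seed)
    current = root
    for _ in range(depth):
        candidates = []
        for _ in range(fanout):
            child = current.fork()                     # branch
            child.write("score", rng.randint(0, 100))  # mutate
            candidates.append(child)
        # evaluate: continue from the highest-scoring candidate
        current = max(candidates, key=lambda b: b.read("score")[0])
    return current


root = Branch()
root.write("base_row", 1)
leaf = explore(root, depth=8, fanout=3)
value, hops = leaf.read("base_row")
print(value, hops)
```

In this toy model, reading `base_row` from the leaf takes one hop per level of branch depth, which mirrors in miniature the read slowdown the abstract reports for systems optimized for fast branching. Copying the full state on every `fork` would instead give constant-time reads at the cost of expensive branch creation.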