DART: Semantic Recoverability for Structured Tool Agents

📅 2026-05-22

📈 Citations: 0

✨ Influential: 0

career value

131K/year

🤖 AI Summary

This work addresses the challenge of determining whether a local recovery point is semantically valid when structured tool-using agents fail mid-execution, particularly in scenarios where downstream components have already committed to outputs from upstream stages. The paper introduces DART, a runtime system that formalizes the notion of “semantic recoverability” for the first time. DART enables safe and efficient partial recovery by identifying failure instances, verifying semantic boundaries, aligning checkpoints, and selecting legitimate recovery points under dependency and effect constraints. Its modular architecture incorporates explicit acceptability checks to prevent invalidation of already-committed downstream work. Empirical evaluation across three LLM-driven tasks and the LangGraph framework demonstrates that DART successfully recovers all commitment-sensitive cases where baseline methods fail, with no unsafe rollbacks detected in a five-domain safety audit.

📝 Abstract

When a structured tool agent fails mid-execution, the runtime faces a dilemma: replaying the entire task is safe but wasteful, while restoring from a local checkpoint is efficient but can leave committed downstream work tied to an upstream history that no longer exists. This tension is acute in commitment-sensitive settings, where rollback targets a single failed instance yet downstream consumers have already acted on its output. Existing recovery approaches provide mechanical rollback but no criterion for whether a local restore remains semantically valid after downstream commitment. We formalize this gap as semantic recoverability and address it in DART, a modular runtime that localizes the failed instance, certifies semantically recoverable boundaries of that instance, aligns checkpoints to those boundaries, and selects an admissible restore point that preserves committed downstream work under dependency and effect constraints-or blocks otherwise. Across three LLM-driven domains and external validation on a LangGraph-based substrate, DART correctly recovers all evaluated commitment-sensitive cases where baseline local recovery fails, and a five-domain safety audit finds no unsafe admitted rollbacks. These results show that controller legality does not imply semantic validity, and that sound local recovery requires an explicit admissibility check.

Problem

Research questions and friction points this paper is trying to address.

semantic recoverability

structured tool agents

commitment-sensitive

local recovery

runtime failure

Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic recoverability

structured tool agents

local checkpoint recovery