🤖 AI Summary
This work addresses the challenge of ensuring reliability for large language model (LLM) agents in production environments, where asynchronous execution and frequent failures complicate dependable operation. The authors propose LogAct, a novel abstraction that models each agent as a state machine replaying a shared log. LogAct makes actions visible in the log before they execute and adds a pluggable external voting mechanism, enabling safe interception of harmful behaviors. It further leverages LLM self-reflective reasoning over the execution history to support semantically consistent recovery and performance optimization. By decoupling state management and integrating health monitoring, the architecture stops all unwanted actions for a target model on a representative benchmark with only a 3% drop in benign utility, while also enabling efficient fault recovery and reducing collective token consumption in agent swarms.
📝 Abstract
Agents are LLM-driven components that can mutate environments in powerful, arbitrary ways. Extracting guarantees for the execution of agents in production environments is challenging due to asynchrony and failures. In this paper, we propose a new abstraction called LogAct, in which each agent is a deconstructed state machine replaying a shared log. In LogAct, agentic actions are visible in the shared log before they are executed; can be stopped prior to execution by pluggable, decoupled voters; and can be recovered consistently in the case of agent or environment failure. LogAct enables agentic introspection, allowing the agent to analyze its own execution history using LLM inference, which in turn enables semantic variants of recovery, health checks, and optimization. In our evaluation, LogAct agents recover efficiently and correctly from failures; debug their own performance; optimize token usage in swarms; and stop all unwanted actions for a target model on a representative benchmark with just a 3% drop in benign utility.
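The log-before-execution discipline described in the abstract can be illustrated with a minimal sketch. All names here (`SharedLog`, `LogEntry`, `propose`, `execute`, `replay`) are hypothetical, chosen for illustration; the paper's actual API may differ. The sketch shows the three properties the abstract lists: actions appear in the log before executing, pluggable voters can veto them pre-execution, and a restarted agent can recover by replaying its executed log prefix.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class LogEntry:
    """One proposed agentic action recorded in the shared log."""
    agent_id: str
    action: str          # description of the proposed environment mutation
    executed: bool = False
    vetoed: bool = False

@dataclass
class SharedLog:
    """Append-only shared log; actions become visible here before execution."""
    entries: List[LogEntry] = field(default_factory=list)
    # Pluggable, decoupled voters: each returns False to veto an action.
    voters: List[Callable[[LogEntry], bool]] = field(default_factory=list)

    def propose(self, agent_id: str, action: str) -> LogEntry:
        entry = LogEntry(agent_id, action)
        self.entries.append(entry)   # visible in the log before execution
        return entry

    def execute(self, entry: LogEntry, effect: Callable[[], None]) -> bool:
        # Any voter returning False stops the action prior to execution.
        if any(not vote(entry) for vote in self.voters):
            entry.vetoed = True
            return False
        effect()                     # mutate the environment
        entry.executed = True
        return True

    def replay(self, agent_id: str) -> List[LogEntry]:
        # Recovery: a restarted agent rebuilds state from its executed prefix.
        return [e for e in self.entries
                if e.agent_id == agent_id and e.executed]
```

For example, a voter that vetoes destructive actions would let `propose`/`execute` record a safe write but block a deletion, and `replay` would then reconstruct only the actions that actually took effect.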