AgentSPEX: An Agent SPecification and EXecution Language

📅 2026-04-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

LLM agent systems often leave control flow and intermediate state implicit, which makes their behavior hard to control, debug, and reuse. This work proposes AgentSPEX, a declarative language for LLM-agent workflows that supports typed steps, branching, loops, parallel execution, reusable submodules, and explicit state management. Workflows run in an agent harness that provides tool access, a sandboxed virtual environment, checkpointing, verification, and logging, and a visual editor keeps graph and workflow views synchronized. The design decouples workflow logic from Python implementation code, with the aim of making agents easier to inspect and maintain. The authors evaluate AgentSPEX on 7 benchmarks, report a user study in which it was found more interpretable and accessible than a popular existing agent framework, and ship ready-to-use agents for deep research and scientific research.

📝 Abstract

Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps, leaving control flow and intermediate state implicit and making agent behavior potentially difficult to control. Orchestration frameworks such as LangGraph, DSPy, and CrewAI impose greater structure through explicit workflow definitions, but tightly couple workflow logic with Python, making agents difficult to maintain and modify. In this paper, we introduce AgentSPEX, an Agent SPecification and EXecution Language for specifying LLM-agent workflows with explicit control flow and modular structure, along with a customizable agent harness. AgentSPEX supports typed steps, branching and loops, parallel execution, reusable submodules, and explicit state management, and these workflows execute within an agent harness that provides tool access, a sandboxed virtual environment, and support for checkpointing, verification, and logging. Furthermore, we provide a visual editor with synchronized graph and workflow views for authoring and inspection. We include ready-to-use agents for deep research and scientific research, and we evaluate AgentSPEX on 7 benchmarks. Finally, we show through a user study that AgentSPEX provides a more interpretable and accessible workflow-authoring paradigm than a popular existing agent framework.
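
To make the abstract's list of constructs concrete, the following is a minimal sketch of a workflow expressed as data and run by a small interpreter over an explicit state dictionary. It is not AgentSPEX syntax and not the authors' harness: the step types (`set`, `call_tool`, `if`, `while`, `parallel`, `module`), the `TOOLS` table, and every key name are hypothetical, and the "tools" are plain Python functions standing in for model or tool calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

State = dict[str, Any]

# Hypothetical stand-ins for LLM or tool calls (assumption, not a real API).
TOOLS: dict[str, Callable[..., Any]] = {
    "search": lambda query: [f"{query}: result {i}" for i in range(2)],
    "summarize": lambda items: " | ".join(items),
    "count": lambda items: len(items),
    "increment": lambda n: n + 1,
}


def run(steps: list[dict], state: State, modules: dict[str, list[dict]]) -> State:
    """Interpret declarative steps; all intermediate results live in `state`."""
    for step in steps:
        kind = step["type"]
        if kind == "set":
            state[step["key"]] = step["value"]
        elif kind == "call_tool":
            # Argument values name state keys, so data flow is explicit.
            args = {name: state[key] for name, key in step["args"].items()}
            state[step["out"]] = TOOLS[step["tool"]](**args)
        elif kind == "if":
            branch = step["then"] if step["cond"](state) else step.get("else", [])
            run(branch, state, modules)
        elif kind == "while":
            iterations = 0
            # max_iters bounds the loop so a bad condition cannot spin forever.
            while step["cond"](state) and iterations < step.get("max_iters", 10):
                run(step["body"], state, modules)
                iterations += 1
        elif kind == "parallel":
            # Each branch works on a copy; only declared outputs are merged back.
            with ThreadPoolExecutor() as pool:
                results = list(pool.map(
                    lambda b: run(b["steps"], dict(state), modules),
                    step["branches"],
                ))
            for branch, result in zip(step["branches"], results):
                for key in branch["outputs"]:
                    state[key] = result[key]
        elif kind == "module":
            run(modules[step["name"]], state, modules)
        else:
            raise ValueError(f"unknown step type: {kind}")
    return state


# A reusable submodule: search on the topic, then summarize the hits.
MODULES = {
    "search_and_summarize": [
        {"type": "call_tool", "tool": "search", "args": {"query": "topic"}, "out": "hits"},
        {"type": "call_tool", "tool": "summarize", "args": {"items": "hits"}, "out": "summary"},
    ],
}

WORKFLOW = [
    {"type": "module", "name": "search_and_summarize"},
    {"type": "parallel", "branches": [
        {"steps": [{"type": "call_tool", "tool": "count",
                    "args": {"items": "hits"}, "out": "n_hits"}],
         "outputs": ["n_hits"]},
        {"steps": [{"type": "call_tool", "tool": "search",
                    "args": {"query": "summary"}, "out": "followup"}],
         "outputs": ["followup"]},
    ]},
    {"type": "while", "cond": lambda s: s["round"] < 3, "body": [
        {"type": "call_tool", "tool": "increment", "args": {"n": "round"}, "out": "round"},
    ]},
    {"type": "if", "cond": lambda s: s["n_hits"] >= 2,
     "then": [{"type": "set", "key": "verdict", "value": "enough"}],
     "else": [{"type": "set", "key": "verdict", "value": "search again"}]},
]

final_state = run(WORKFLOW, {"topic": "agents", "round": 0}, MODULES)
```

Because the workflow is a data structure and not interleaved Python control flow, it can be inspected, rendered as a graph, or checkpointed between steps without reading the interpreter. The conditions here are Python lambdas only for brevity; a fully declarative language would express them in its own syntax.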

Problem

Research questions and friction points this paper is trying to address.

language-model agents

workflow specification

control flow

modular structure

agent maintainability

Innovation

Methods, ideas, or system contributions that make the work stand out.

AgentSPEX

workflow specification

explicit control flow