🤖 AI Summary
Existing multi-agent systems rely on unstructured natural-language communication, which leads to ambiguous collaboration, poor fault tolerance, and limited cross-domain interoperability. This paper introduces Agent Context Protocols (ACPs), a domain-agnostic, structured collaboration framework comprising: (1) persistent execution blueprints modeled as directed acyclic graphs (DAGs) that explicitly capture dependencies between intermediate results; (2) standardized message schemas and built-in fault-recovery mechanisms for robust inter-agent communication; and (3) modular protocol extension interfaces for integrating heterogeneous, domain-specialized agents. ACPs are described as the first framework to unify dependency modeling with protocol standardization, with the aim of making collaborative reasoning more reliable. Evaluated on AssistantBench’s long-horizon web tasks, ACP-based agents achieve 28.3% accuracy. The multimodal technical reports they generate also outperform leading commercial AI systems in human evaluation.
📝 Abstract
AI agents have become increasingly adept at complex tasks such as coding, reasoning, and multimodal understanding. However, building generalist systems requires moving beyond individual agents to collective inference, a paradigm in which multi-agent systems with diverse, task-specialized agents complement one another through structured communication and collaboration. Today, coordination is usually handled with imprecise, ad-hoc natural language, which limits complex interaction and hinders interoperability with domain-specific agents. We introduce Agent Context Protocols (ACPs): a domain- and agent-agnostic family of structured protocols for agent-agent communication, coordination, and error handling. ACPs combine (i) persistent execution blueprints (explicit dependency graphs that store intermediate agent outputs) with (ii) standardized message schemas, enabling robust and fault-tolerant multi-agent collective inference. ACP-powered generalist systems reach state-of-the-art performance: 28.3% accuracy on AssistantBench for long-horizon web assistance and best-in-class multimodal technical reports, outperforming commercial AI systems in human evaluation. ACPs are highly modular and extensible, allowing practitioners to build top-tier generalist agents quickly.
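
To make the two ingredients named in the abstract more concrete, here is a minimal Python sketch of a dependency graph that stores intermediate agent outputs, combined with a fixed message shape used for both results and errors. It is an illustration under stated assumptions, not the paper's implementation: the names `Message`, `Step`, `Blueprint`, the `"ok"`/`"error"` status values, and the simple retry policy are all hypothetical, and the agents are stand-in Python functions.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List


@dataclass
class Message:
    """Hypothetical structured message; the paper's actual schema may differ."""
    step_id: str
    status: str          # "ok" or "error" (assumed status values)
    payload: Any = None  # agent output on success, error text on failure


@dataclass
class Step:
    """One blueprint node: an agent call plus the steps whose outputs it needs."""
    agent: Callable[[Dict[str, Any]], Any]
    depends_on: List[str] = field(default_factory=list)


class Blueprint:
    """Illustrative execution blueprint: a DAG that keeps every step's output."""

    def __init__(self, steps: Dict[str, Step], max_retries: int = 1):
        self.steps = steps
        self.max_retries = max_retries
        self.outputs: Dict[str, Any] = {}  # stored intermediate results
        self.log: List[Message] = []       # every message exchanged so far

    def run(self) -> Dict[str, Any]:
        # Map each step to its prerequisites; graphlib rejects cyclic graphs.
        graph = {sid: set(step.depends_on) for sid, step in self.steps.items()}
        for sid in TopologicalSorter(graph).static_order():
            if sid in self.outputs:
                continue  # result already stored, so a rerun can resume here
            step = self.steps[sid]
            inputs = {dep: self.outputs[dep] for dep in step.depends_on}
            for _ in range(self.max_retries + 1):
                try:
                    msg = Message(sid, "ok", step.agent(inputs))
                except Exception as exc:  # turn any failure into an error message
                    msg = Message(sid, "error", str(exc))
                self.log.append(msg)
                if msg.status == "ok":
                    self.outputs[sid] = msg.payload
                    break
            else:
                raise RuntimeError(f"step {sid!r} failed: {msg.payload}")
        return self.outputs


# Usage: a search agent that fails once, followed by a dependent summarizer.
calls = {"n": 0}


def flaky_search(inputs: Dict[str, Any]) -> List[str]:
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("search timed out")
    return ["doc A", "doc B"]


def summarize(inputs: Dict[str, Any]) -> str:
    return f"{len(inputs['search'])} documents found"


bp = Blueprint({
    "search": Step(flaky_search),
    "summarize": Step(summarize, depends_on=["search"]),
})
result = bp.run()
```

In this sketch the first search failure is recorded as an error message and retried, and the summarizer runs only once the search output is stored. Because results persist in `outputs`, a later call to `run()` skips steps that already succeeded, which is one simple way to read the "persistent" part of the blueprint idea.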