🤖 AI Summary
To address three core challenges in multi-agent systems (interoperability, collaborative efficiency, and knowledge sharing), this paper proposes Co-TAP, a three-layer agent interaction protocol. It comprises: (1) the Human-Agent Interaction (HAI) protocol, an event-driven interface for human–agent coordination; (2) the Unified Agent Protocol (UAP), which enables interoperation among heterogeneous agents; and (3) the MEK cognitive chain, which formalizes the transformation of experiential data into structured knowledge. The paper presents Co-TAP as a framework that standardizes interaction, connectivity, and cognition together, drawing on event-driven architecture, unified service discovery, cross-protocol translation, and standardized memory extraction. These mechanisms are intended to support dynamic cross-platform agent onboarding, real-time communication, and collaborative learning. The reported evaluation indicates improvements in interaction reliability and knowledge reuse efficiency. The authors position Co-TAP as a scalable engineering paradigm and a theoretical foundation for distributed artificial intelligence.
📝 Abstract
This paper proposes Co-TAP (T: Triple, A: Agent, P: Protocol), a three-layer agent interaction protocol that addresses the challenges multi-agent systems face along three core dimensions: Interoperability, Interaction and Collaboration, and Knowledge Sharing. We propose a layered solution composed of three core protocols: the Human-Agent Interaction Protocol (HAI), the Unified Agent Protocol (UAP), and the Memory-Extraction-Knowledge Protocol (MEK). HAI operates at the interaction layer and standardizes the flow of information between users, interfaces, and agents through an event-driven communication paradigm, which is intended to keep interactions real-time, reliable, and coordinated. UAP, the core of the infrastructure layer, is designed to break down communication barriers among heterogeneous agents through unified service discovery and protocol conversion, enabling interconnection and interoperability across the underlying network. MEK operates at the cognitive layer. By establishing a standardized "Memory (M) - Extraction (E) - Knowledge (K)" cognitive chain, it enables agents to learn from individual experience and turn it into shareable knowledge, laying the foundation for collective intelligence. We believe this protocol framework will provide a solid engineering foundation and theoretical guidance for building the next generation of efficient, scalable, and intelligent multi-agent applications.
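To make the three layers more concrete, the following is a minimal Python sketch of what each layer's responsibility might look like in code. It is not taken from the paper: the class and function names (`Event`, `EventBus`, `AgentRecord`, `Registry`, `extract_knowledge`), the message fields, and the choice of "success rate per action" as the extracted knowledge are all assumptions made purely for illustration, and the actual HAI, UAP, and MEK specifications may differ substantially.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


# --- Interaction layer (HAI-like): event-driven message flow ---
@dataclass
class Event:
    type: str       # hypothetical event type label, e.g. "user_message"
    payload: dict   # hypothetical free-form event body


class EventBus:
    """Routes typed events to the handlers subscribed to that type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event and return how many handlers received it."""
        handlers = self._handlers.get(event.type, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# --- Infrastructure layer (UAP-like): discovery and protocol conversion ---
@dataclass
class AgentRecord:
    name: str
    protocol: str             # hypothetical label for the agent's native protocol
    capabilities: List[str]


class Registry:
    """Keeps agent records plus one message adapter per native protocol."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentRecord] = {}
        self._adapters: Dict[str, Callable[[dict], dict]] = {}

    def register(self, record: AgentRecord, adapter: Callable[[dict], dict]) -> None:
        self._agents[record.name] = record
        self._adapters[record.protocol] = adapter

    def discover(self, capability: str) -> List[str]:
        """Return the names of agents advertising the capability, sorted."""
        return sorted(
            name for name, rec in self._agents.items() if capability in rec.capabilities
        )

    def translate(self, agent_name: str, message: dict) -> dict:
        """Convert a unified message into the target agent's native format."""
        record = self._agents[agent_name]
        return self._adapters[record.protocol](message)


# --- Cognitive layer (MEK-like): memory -> extraction -> knowledge ---
def extract_knowledge(memories: List[dict]) -> Dict[str, float]:
    """Summarize raw experience records as a success rate per action.

    Each memory is assumed to look like {"action": str, "success": bool}.
    """
    totals: Dict[str, int] = defaultdict(int)
    successes: Dict[str, int] = defaultdict(int)
    for memory in memories:
        totals[memory["action"]] += 1
        if memory["success"]:
            successes[memory["action"]] += 1
    return {action: successes[action] / totals[action] for action in totals}
```

In this sketch, a user interface would publish events on the `EventBus`, an agent handler would use `Registry.discover` and `Registry.translate` to reach a peer that speaks a different protocol, and the outcomes of those calls would be logged as memories and periodically condensed by `extract_knowledge` into a small table that other agents could reuse.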