Co-TAP: Three-Layer Agent Interaction Protocol Technical Report

📅 2025-10-09
🤖 AI Summary
To address three core challenges in multi-agent systems (interoperability, collaborative efficiency, and knowledge sharing), this paper proposes Co-TAP, a three-layer agent interaction protocol. It comprises: (1) the Human-Agent Interaction (HAI) protocol, an event-driven interface for human–agent coordination; (2) the Unified Agent Protocol (UAP), which enables interoperation among heterogeneous agents; and (3) the Memory-Extraction-Knowledge (MEK) cognitive chain, which formalizes the transformation of experiential data into structured knowledge. Co-TAP is the first framework to holistically standardize interaction, connectivity, and cognition, leveraging event-driven architecture, unified service discovery, cross-protocol translation, and standardized memory extraction. Together, these mechanisms support dynamic cross-platform agent onboarding, real-time communication, and collaborative learning. Empirical evaluation demonstrates significant improvements in interaction reliability and knowledge reuse efficiency. Co-TAP thus establishes a scalable engineering paradigm and a foundational theoretical framework for distributed artificial intelligence.

📝 Abstract
This paper proposes Co-TAP (T: Triple, A: Agent, P: Protocol), a three-layer agent interaction protocol designed to address the challenges multi-agent systems face across three core dimensions: Interoperability, Interaction and Collaboration, and Knowledge Sharing. We propose a layered solution composed of three core protocols: the Human-Agent Interaction Protocol (HAI), the Unified Agent Protocol (UAP), and the Memory-Extraction-Knowledge Protocol (MEK). HAI addresses the interaction layer: it standardizes the flow of information between users, interfaces, and agents by defining an event-driven communication paradigm, which ensures the real-time performance, reliability, and synergy of interactions. UAP, the core of the infrastructure layer, is designed to break down communication barriers among heterogeneous agents through unified service discovery and protocol conversion mechanisms, thereby enabling seamless interconnection and interoperability of the underlying network. MEK, in turn, operates at the cognitive layer. By establishing a standardized "Memory (M) - Extraction (E) - Knowledge (K)" cognitive chain, it gives agents the ability to learn from individual experiences and form shareable knowledge, laying the foundation for true collective intelligence. We believe this protocol framework will provide a solid engineering foundation and theoretical guidance for building the next generation of efficient, scalable, and intelligent multi-agent applications.
Problem

Research questions and friction points this paper is trying to address.

Addresses multi-agent system interoperability, interaction, and knowledge sharing challenges
Breaks communication barriers among heterogeneous agents for seamless interconnection
Establishes a cognitive chain for agents to learn and share knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-Agent Interaction Protocol ensures reliable event-driven communication
Unified Agent Protocol enables seamless interconnection of heterogeneous agents
Memory-Extraction-Knowledge Protocol establishes standardized cognitive chain for learning
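
The three layers listed above can be illustrated with a small toy sketch. It is not the paper's specification or API: every class, function, event type, and protocol name below (`Event`, `EventBus`, `AgentRegistry`, `MemoryRecord`, `extract_knowledge`, `"user_message"`, `"json"`, `"text"`) is a hypothetical stand-in chosen for illustration. It only shows the general shape of an event-driven interaction channel, a registry with service discovery and per-protocol translation, and a memory-to-knowledge reduction.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class Event:
    """Interaction layer (HAI-like): one event between a user interface and an agent."""
    type: str
    payload: Dict[str, Any]


class EventBus:
    """Minimal publish/subscribe dispatcher standing in for an event-driven channel."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event: Event) -> None:
        for handler in self._handlers.get(event.type, []):
            handler(event)


class AgentRegistry:
    """Infrastructure layer (UAP-like): discovery plus per-protocol message translation."""

    def __init__(self) -> None:
        self._agents: Dict[str, Tuple[str, Callable[[Any], Any]]] = {}
        self._translators: Dict[str, Callable[[Dict[str, Any]], Any]] = {}

    def add_translator(self, protocol: str, to_native: Callable[[Dict[str, Any]], Any]) -> None:
        self._translators[protocol] = to_native

    def register(self, name: str, protocol: str, handler: Callable[[Any], Any]) -> None:
        self._agents[name] = (protocol, handler)

    def discover(self, protocol: Optional[str] = None) -> List[str]:
        # With no filter, list every agent; otherwise only those speaking `protocol`.
        return sorted(n for n, (p, _) in self._agents.items() if protocol in (None, p))

    def call(self, name: str, message: Dict[str, Any]) -> Any:
        # Convert the common dict message into the agent's native format, then invoke it.
        protocol, handler = self._agents[name]
        return handler(self._translators[protocol](message))


@dataclass
class MemoryRecord:
    """Cognitive layer (MEK-like), step M: a raw record of one interaction."""
    agent: str
    task: str
    success: bool


def extract_knowledge(memories: List[MemoryRecord]) -> Dict[str, List[str]]:
    """Steps E and K: reduce raw records to a shareable map of task -> agents that succeeded."""
    knowledge: Dict[str, set] = {}
    for m in memories:
        if m.success:
            knowledge.setdefault(m.task, set()).add(m.agent)
    return {task: sorted(agents) for task, agents in knowledge.items()}


def run_demo() -> Tuple[List[str], Dict[str, List[str]]]:
    registry = AgentRegistry()
    registry.add_translator("json", lambda m: m)  # dict passes through unchanged
    registry.add_translator("text", lambda m: f"{m['task']}: {m['input']}")
    registry.register("echo", "json", lambda d: d["input"])
    registry.register("shout", "text", lambda s: s.upper())

    bus, memories, replies = EventBus(), [], []

    def on_user_message(event: Event) -> None:
        p = event.payload
        reply = registry.call(p["agent"], p)
        memories.append(MemoryRecord(p["agent"], p["task"], success=bool(reply)))
        bus.publish(Event("agent_reply", {"agent": p["agent"], "text": reply}))

    bus.subscribe("user_message", on_user_message)
    bus.subscribe("agent_reply", lambda e: replies.append(e.payload["text"]))

    for agent in registry.discover():
        bus.publish(Event("user_message", {"agent": agent, "task": "repeat", "input": "hi"}))
    return replies, extract_knowledge(memories)
```

Calling `run_demo()` routes one user event to each discovered agent through its translator and returns the collected replies together with the extracted task-to-agent map. The success flag here is a deliberately crude placeholder (a non-empty reply); a real extraction step would need a far richer notion of outcome.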
Authors

Shunyu An
Miao Wang
Yongchao Li
Dong Wan
Lina Wang (Professor, Wuhan University; Computer Security)
Ling Qin
Liqin Gao
Congyao Fan
Zhiyong Mao
Jiange Pu
Wenji Xia
Dong Zhao
Rui Hu
Ji Lu
Guiyue Zhou
Baoyu Tang
Yanqin Gao
Yongsheng Du
Daigang Xu
Lingjun Huang
Baoli Wang
Xiwen Zhang (LLM, diffusion model, computer systems, machine learning, computational biology)
Luyao Wang
Shilong Liu (RS@ByteDance, PhD@THU; Computer Vision, Object Detection, Visual Grounding, Multi-Modality, Multimodal Agent)