🤖 AI Summary
Existing multi-agent systems built on large language models often model agent and tool capabilities coarsely, matching each query against a single agent-level description, which leads to inaccurate agent selection. This paper proposes Agent-as-a-Graph, a retrieval method that represents agents and tools as nodes and edges in a heterogeneous knowledge graph. It combines vector-based semantic retrieval with graph traversal and adds a type-weighted reciprocal rank fusion (wRRF) step for fine-grained, capability-aware agent discovery. Evaluated on LiveMCPBenchmark, the approach improves joint tool–agent retrieval, gaining 14.9% in Recall@5 and 14.6% in nDCG@5 over baselines; wRRF contributes a further 2.4% gain over standard RRF. The core contribution is the first formulation of agent capabilities as a structured knowledge graph, which allows semantic retrieval and graph traversal to complement each other.
📝 Abstract
Recent advances in Large Language Model Multi-Agent Systems enable scalable orchestration and retrieval of specialized, parallelized subagents, each equipped with hundreds or thousands of Model Context Protocol (MCP) servers and tools. However, existing agent, MCP, and retrieval methods typically match queries against a single agent description, which obscures each agent's fine-grained tool capabilities and results in suboptimal agent selection. We introduce Agent-as-a-Graph retrieval, a knowledge graph retrieval-augmented generation approach that represents both tools and their parent agents as nodes and edges in a knowledge graph. During retrieval, i) relevant agent and tool nodes are first retrieved through vector search, ii) a type-specific weighted reciprocal rank fusion (wRRF) reranks the retrieved tools and agents, and iii) the knowledge graph is traversed from the reranked nodes to their parent agents to produce the final set of agents. We evaluate Agent-as-a-Graph on the LiveMCPBenchmark, achieving 14.9% and 14.6% improvements in Recall@5 and nDCG@5 over prior state-of-the-art retrievers, and a further 2.4% improvement from wRRF optimizations.
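The three retrieval steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the wRRF formula nor its weights, so the code assumes the standard reciprocal rank fusion term 1/(k + rank) scaled by a per-type weight. The toy graph (`TOOL_PARENT`), the weights in `TYPE_WEIGHT`, the constant `k=60`, and the hard-coded ranked lists standing in for vector search results are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph: each tool node has an edge to its parent agent.
TOOL_PARENT = {
    "search_flights": "travel_agent",
    "book_hotel": "travel_agent",
    "run_sql": "data_agent",
    "plot_chart": "data_agent",
}

# Assumed type-specific weights; the actual values are not given in the source.
TYPE_WEIGHT = {"agent": 1.0, "tool": 1.5}


def node_type(node_id):
    """Classify a node as a tool (has a parent edge) or an agent."""
    return "tool" if node_id in TOOL_PARENT else "agent"


def weighted_rrf(rankings, k=60):
    """Fuse ranked lists of node ids.

    Each appearance adds weight(type) / (k + rank) to the node's score,
    i.e. standard RRF with an assumed per-type multiplier.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] += TYPE_WEIGHT[node_type(node_id)] / (k + rank)
    # Sort by descending score; break ties by node id for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


def resolve_agents(fused_nodes, top_k=5):
    """Map each fused node to its parent agent and keep the first top_k unique agents."""
    agents = []
    for node_id in fused_nodes:
        agent = TOOL_PARENT.get(node_id, node_id)  # agent nodes map to themselves
        if agent not in agents:
            agents.append(agent)
        if len(agents) == top_k:
            break
    return agents


# Step i (stand-in): ranked lists that a vector search might return for one query.
agent_hits = ["travel_agent", "data_agent"]
tool_hits = ["run_sql", "search_flights", "plot_chart"]

fused = weighted_rrf([agent_hits, tool_hits])  # step ii: type-weighted fusion
final_agents = resolve_agents(fused)           # step iii: traverse to parent agents
print(final_agents)  # ['data_agent', 'travel_agent']
```

In this toy run, matching on agent descriptions alone would rank `travel_agent` first, while the strong `run_sql` tool match moves `data_agent` to the top once tool nodes are fused in and resolved to their parents. This is the kind of tool-level signal the abstract says a single agent description obscures.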