A Token/KV-Cache Communication Media Selection and Resource Allocation Strategy for Multi-Agent Collaboration

📅 2026-05-25

🤖 AI Summary

This work addresses embodied multi-agent collaboration under 6G wireless constraints, where choosing between token-based and key-value (KV) cache-based interaction trades computation overhead against communication overhead. Because a joint optimization framework that accounts for both channel and computational resources has been lacking, the authors propose an adaptive framework for joint medium selection and wireless resource allocation. By modeling the end-to-end latency of both interaction modalities, they formulate a latency-minimization problem and develop a low-complexity heuristic algorithm. Theoretical analysis provides the first characterization of the relative performance of the two media under varying system parameters. Simulations show that the proposed approach achieves lower collaboration latency than single-medium baselines, improving the efficiency and robustness of multi-agent systems in 6G networks.

📝 Abstract

The convergence of large language models (LLMs) with 6G networks is fostering a paradigm of autonomous multi-agent cooperation, which in turn is expected to substantially increase east-west traffic. Although latent-space interaction mechanisms can enable more efficient collaboration than symbolic natural-language (NL) exchanges, prior work often abstracts away the associated communication overhead under practical wireless constraints. In embodied multi-agent settings, heterogeneous interaction media incur disparate inference and transmission costs, thereby inducing an inherent end-to-end (E2E) latency trade-off. To address this, we propose a joint design that integrates communication-media selection with wireless resource allocation. Through analytical characterization and simulation-based evaluation, we show that neither token-based transmission nor key-value (KV) cache-based transmission is uniformly optimal across operating regimes, as performance depends critically on system parameters such as available computational resources and channel conditions. Accordingly, we formulate a joint optimization problem aimed at minimizing the E2E latency of multi-agent collaboration and develop a low-complexity joint media selection and resource allocation (JMSRA) algorithm. Numerical results further confirm that, by adaptively coordinating the interaction media and bandwidth allocation over heterogeneous links, the proposed scheme achieves markedly reduced E2E latency relative to conventional NL-only and KV-cache-only baselines, enabling efficient and robust multi-agent collaboration in future wireless networks.
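The trade-off described in the abstract can be illustrated with a toy per-link latency model. This is a minimal sketch under stated assumptions, not the paper's JMSRA algorithm or its latency model: the `Link` fields, the function names, and all constants (bits per token, KV-cache bits per token, prefill FLOPs per token) are hypothetical values chosen only for illustration. It assumes that sending tokens is cheap on the link but forces the receiver to recompute the prefill, while sending the KV cache is heavy on the link but skips that recomputation.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical description of one agent-to-agent link."""
    bandwidth_hz: float   # allocated bandwidth in Hz
    spectral_eff: float   # achievable rate per Hz, in bit/s/Hz
    rx_flops: float       # receiver compute throughput in FLOP/s


def token_latency(link, n_tokens, bits_per_token=16.0, flops_per_token=2e10):
    """Assumed token-mode latency: small payload plus receiver-side prefill."""
    rate_bps = link.bandwidth_hz * link.spectral_eff
    tx_time = n_tokens * bits_per_token / rate_bps
    prefill_time = n_tokens * flops_per_token / link.rx_flops
    return tx_time + prefill_time


def kv_latency(link, n_tokens, kv_bits_per_token=4e6):
    """Assumed KV-cache-mode latency: large payload, no prefill at the receiver."""
    rate_bps = link.bandwidth_hz * link.spectral_eff
    return n_tokens * kv_bits_per_token / rate_bps


def select_medium(link, n_tokens):
    """Pick whichever medium gives the lower modeled latency on this link."""
    t_tok = token_latency(link, n_tokens)
    t_kv = kv_latency(link, n_tokens)
    return ("token", t_tok) if t_tok <= t_kv else ("kv", t_kv)


if __name__ == "__main__":
    # Good channel, weak receiver: transmission is cheap, prefill is expensive.
    fast_link = Link(bandwidth_hz=100e6, spectral_eff=5.0, rx_flops=1e12)
    # Poor channel, strong receiver: transmission dominates.
    slow_link = Link(bandwidth_hz=1e6, spectral_eff=1.0, rx_flops=1e14)
    for name, link in [("fast_link", fast_link), ("slow_link", slow_link)]:
        medium, latency = select_medium(link, n_tokens=1000)
        print(f"{name}: choose {medium}, modeled latency {latency:.3f} s")
```

With these illustrative numbers, the good channel paired with a weak receiver favors the KV cache, while the poor channel paired with a strong receiver favors tokens, which matches the abstract's point that neither medium is uniformly optimal. A joint scheme would additionally divide a shared bandwidth budget across links, which this per-link sketch does not attempt.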

Problem

Research questions and friction points this paper is trying to address.

multi-agent collaboration

communication media selection

resource allocation

end-to-end latency

KV-cache transmission

Innovation

Methods, ideas, or system contributions that make the work stand out.

KV-cache transmission

token-based transmission

multi-agent collaboration