On the Reliability Limits of LLM-Based Multi-Agent Planning

📅 2026-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines the reliability limits of LLM-based multi-agent planning, framed as a delegated decision problem. The multi-agent system is modeled as a finite acyclic decision network whose stages process shared context, communicate through capacity-limited language interfaces, and may invoke human review. The authors show that, absent new exogenous signals, any such network is decision-theoretically dominated by a centralized Bayes decision maker with access to the same information. In the common-evidence regime, designing the network under a finite communication budget therefore amounts to choosing a budget-constrained stochastic experiment on the shared signal. Under proper scoring rules, the value gap caused by communication and compression equals an expected posterior divergence, which reduces to conditional mutual information under logarithmic loss and to expected squared posterior error under the Brier score. Experiments with LLMs on a controlled problem set illustrate these characterizations.

📝 Abstract

This technical note studies the reliability limits of LLM-based multi-agent planning as a delegated decision problem. We model the LLM-based multi-agent architecture as a finite acyclic decision network in which multiple stages process shared model-context information, communicate through language interfaces with limited capacity, and may invoke human review. We show that, without new exogenous signals, any delegated network is decision-theoretically dominated by a centralized Bayes decision maker with access to the same information. In the common-evidence regime, this implies that optimizing over multi-agent directed acyclic graphs under a finite communication budget can be recast as choosing a budget-constrained stochastic experiment on the shared signal. We also characterize the loss induced by communication and information compression. Under proper scoring rules, the gap between the centralized Bayes value and the value after communication admits an expected posterior divergence representation, which reduces to conditional mutual information under logarithmic loss and to expected squared posterior error under the Brier score. These results characterize the fundamental reliability limits of delegated LLM planning. Experiments with LLMs on a controlled problem set further demonstrate these characterizations.
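The gap representation stated in the abstract can be checked numerically on a small example. The sketch below is not taken from the paper: the prior `P_X`, the posteriors `P_Y1_GIVEN_X`, and the 1-bit `ENCODER` are hypothetical values chosen only for illustration, and the message is assumed to be a deterministic function of the shared signal. Under those assumptions, the extra expected loss of a forecaster who sees only the message should equal the expected KL divergence between the two posteriors under logarithmic loss (the conditional mutual information between state and signal given the message) and the expected squared posterior error under the Brier score.

```python
import math

# Hypothetical toy problem (all numbers are illustrative assumptions).
P_X = [0.25, 0.25, 0.25, 0.25]        # prior over the shared signal x
P_Y1_GIVEN_X = [0.1, 0.3, 0.7, 0.9]   # centralized posterior P(y=1 | x)
ENCODER = [0, 0, 1, 1]                # deterministic 1-bit message m = ENCODER[x]


def dist(p1):
    """Binary distribution [P(y=0), P(y=1)]."""
    return [1.0 - p1, p1]


def message_posteriors():
    """P(y=1 | m), obtained by averaging P(y=1 | x) over the x mapped to m."""
    mass, hit = {}, {}
    for x, px in enumerate(P_X):
        m = ENCODER[x]
        mass[m] = mass.get(m, 0.0) + px
        hit[m] = hit.get(m, 0.0) + px * P_Y1_GIVEN_X[x]
    return {m: hit[m] / mass[m] for m in mass}


def log_loss(pred, y):
    return -math.log(pred[y])


def brier_loss(pred, y):
    return sum((pred[k] - (1.0 if k == y else 0.0)) ** 2 for k in range(2))


def expected_loss(loss, forecast):
    """Expected loss when forecast(x) is reported and y ~ P(y | x)."""
    total = 0.0
    for x, px in enumerate(P_X):
        truth = dist(P_Y1_GIVEN_X[x])
        total += px * sum(truth[y] * loss(forecast(x), y) for y in range(2))
    return total


Q = message_posteriors()
central = lambda x: dist(P_Y1_GIVEN_X[x])   # sees x directly
delegated = lambda x: dist(Q[ENCODER[x]])   # sees only the message

log_gap = expected_loss(log_loss, delegated) - expected_loss(log_loss, central)
brier_gap = expected_loss(brier_loss, delegated) - expected_loss(brier_loss, central)

# Expected posterior divergences between central and delegated posteriors.
expected_kl = sum(
    px * sum(p * math.log(p / q) for p, q in zip(central(x), delegated(x)))
    for x, px in enumerate(P_X)
)
expected_sq = sum(
    px * sum((p - q) ** 2 for p, q in zip(central(x), delegated(x)))
    for x, px in enumerate(P_X)
)

print(f"log-loss gap {log_gap:.6f}  expected KL {expected_kl:.6f}")
print(f"Brier gap    {brier_gap:.6f}  expected squared error {expected_sq:.6f}")
```

In this toy setting both gaps are strictly positive, which is consistent with the dominance claim: compressing the signal into one bit can only lose value relative to the centralized forecaster. Replacing `ENCODER` with the identity map `[0, 1, 2, 3]` should drive both gaps to zero.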

Problem

Research questions and friction points this paper is trying to address.

LLM-based multi-agent planning

reliability limits

delegated decision making

communication constraints

Bayes decision theory

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent planning

LLM reliability

delegated decision making