🤖 AI Summary
This work addresses the problem of "debate collapse" in multi-agent debate systems, where erroneous reasoning often dominates due to the absence of effective detection and intervention mechanisms. The authors propose a hierarchical uncertainty quantification framework that measures behavioral uncertainty at the individual, interaction, and system levels, introducing it for the first time as a diagnostic indicator of system failure. Building on this, they develop an uncertainty-driven strategy optimization method that dynamically penalizes self-contradictory statements, inter-agent conflicts, and low-confidence outputs. Experimental results demonstrate that the proposed approach significantly improves decision accuracy, reduces internal inconsistency, and achieves reliable calibration of multi-agent systems across multiple benchmarks.
📄 Abstract
Multi-agent debate (MAD) systems improve LLM reasoning through iterative deliberation, but remain vulnerable to debate collapse, a failure mode in which agents' final decisions converge on erroneous reasoning. Existing methods lack principled mechanisms to detect or prevent such failures. To address this gap, we first propose a hierarchical metric that quantifies behavioral uncertainty at three levels: intra-agent (individual reasoning uncertainty), inter-agent (interactive uncertainty), and system-level (output uncertainty). Empirical analysis across several benchmarks shows that these uncertainty measures reliably signal system failures, validating their use as diagnostic indicators. We then propose a mitigation strategy that formulates an uncertainty-driven policy optimization to penalize self-contradiction, peer conflict, and low-confidence outputs in a dynamic debate environment. Experiments demonstrate that the proposed uncertainty-driven mitigation reliably calibrates the multi-agent system, consistently improving decision accuracy while reducing internal disagreement.
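To make the three-level uncertainty idea concrete, here is a minimal sketch. It is not the paper's formulation: the paper does not specify its estimators here, so this sketch assumes entropy-based proxies (entropy over one agent's answers across rounds for intra-agent uncertainty, entropy over agents' answers within a round for inter-agent uncertainty, entropy over final votes for system-level uncertainty) and a simple linear penalty on a base reward; all function names and the `lambdas` weights are hypothetical.

```python
import math
from collections import Counter

def _entropy(answers):
    """Shannon entropy (nats) of the empirical answer distribution."""
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def intra_agent_uncertainty(agent_answers_per_round):
    """Assumed proxy: entropy of one agent's answers across debate rounds.

    High values mean the agent contradicts itself between rounds.
    """
    return _entropy(agent_answers_per_round)

def inter_agent_uncertainty(answers_in_round):
    """Assumed proxy: entropy of all agents' answers within one round.

    High values mean the agents disagree with each other.
    """
    return _entropy(answers_in_round)

def system_uncertainty(final_answers):
    """Assumed proxy: entropy of the agents' final votes."""
    return _entropy(final_answers)

def penalized_reward(base_reward, u_intra, u_inter, u_sys,
                     lambdas=(0.5, 0.5, 0.5)):
    """Uncertainty-driven objective: subtract weighted uncertainty terms,
    so self-contradiction, peer conflict, and diffuse final votes all
    lower the reward used to optimize the debate policy."""
    l1, l2, l3 = lambdas
    return base_reward - l1 * u_intra - l2 * u_inter - l3 * u_sys
```

Usage: a perfectly self-consistent agent (`["A", "A", "A"]` across rounds) scores zero intra-agent uncertainty, while a 50/50 split between two answers in a round gives the maximal two-way inter-agent uncertainty of `ln 2`; any nonzero uncertainty strictly reduces the penalized reward.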