AI Summary
This work addresses the limitations of multi-agent debate (MAD) frameworks: context inflation, poor scalability, and susceptibility to hallucination-induced errors when relying on prior signals for pruning. To overcome these challenges, the authors propose SVR-MAD, a novel framework that introduces Bayesian inference into MAD for the first time. SVR-MAD treats pre-debate signals as priors and debate outcomes as posterior evidence, dynamically constructing a communication graph that prioritizes answers validated through peer challenge. By guiding pruning decisions with posterior beliefs rather than priors alone, the method effectively mitigates the impact of hallucinations. Experimental results across multiple language models and benchmarks demonstrate that SVR-MAD reduces token consumption by up to 61% while maintaining or even improving accuracy.
Abstract
Multi-Agent Debate (MAD) improves LLM-agent accuracy but suffers from rapid context growth, limiting scalability in larger multi-agent settings. Existing methods prune low-utility communications using prior signals, such as token-level log-likelihoods or LLM self-reported confidence. However, these signals become unreliable under hallucination, degrading the accuracy of MAD methods that rely on them. We propose SVR-MAD, a Bayesian-inspired MAD framework that treats pre-debate signals as priors and debate outcomes as posterior-style evidence for estimating agent correctness. SVR-MAD uses this evidence to incrementally construct the communication graph, prioritizing agents whose answers survive peer challenges. Experiments across multiple LLMs and benchmarks show that SVR-MAD reduces token cost by up to 61% while matching or improving accuracy relative to the most accurate competing MAD baseline.
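The prior-to-posterior idea described above can be illustrated with a small sketch. This is not the authors' algorithm: the function names, the two likelihood values, and the top-k selection rule are all hypothetical, chosen only to show how a debate outcome might revise a pre-debate confidence signal before any pruning decision is made.

```python
# Illustrative sketch only; not the SVR-MAD implementation.
# Assumptions (not from the paper): each agent has a scalar prior in [0, 1],
# each debate outcome is a boolean "survived a peer challenge", and the two
# likelihood values below are made-up constants.

def posterior_correct(prior, survived,
                      p_survive_if_correct=0.8,
                      p_survive_if_wrong=0.3):
    """Bayes' rule for P(agent is correct | challenge outcome)."""
    if survived:
        like_correct, like_wrong = p_survive_if_correct, p_survive_if_wrong
    else:
        like_correct = 1.0 - p_survive_if_correct
        like_wrong = 1.0 - p_survive_if_wrong
    numerator = prior * like_correct
    evidence = numerator + (1.0 - prior) * like_wrong
    # Fall back to the prior if the evidence term is degenerate.
    return numerator / evidence if evidence > 0 else prior


def select_speakers(priors, outcomes, keep=2):
    """Update beliefs from challenge outcomes, then keep the top-`keep` agents."""
    beliefs = dict(priors)
    for agent, survived in outcomes:
        beliefs[agent] = posterior_correct(beliefs[agent], survived)
    ranked = sorted(beliefs, key=beliefs.get, reverse=True)
    return ranked[:keep], beliefs


# Agent "a" reports high confidence but fails its challenge.
priors = {"a": 0.9, "b": 0.6, "c": 0.5}
outcomes = [("a", False), ("b", True), ("c", True)]
kept, beliefs = select_speakers(priors, outcomes)
print(kept)  # ['b', 'c']
```

In this toy run, agent "a" starts with the highest prior but drops below the other two after failing its challenge. Pruning on the prior alone would have kept it, which is the failure mode the abstract attributes to hallucination-affected confidence signals.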