🤖 AI Summary
Problem: The impact of role assignment strategies on reasoning performance in Multi-Agent Debate (MAD) has not been systematically explored, particularly in real-world scenarios where the ground truth is unknown and no effective assignment mechanism exists. Method: We identify the critical influence of where viewpoints are placed within the debate sequence and propose the "Truth Last" strategy, which deliberately assigns the correct viewpoint to a later position in the debate chain. Furthermore, we introduce Multi-Agent Debate Consistency (MADC), a mechanism that uses path-level consistency to automatically identify the most credible agent and treat its viewpoint as the truth, thereby addressing the unknown-truth problem. Contribution/Results: Experiments across nine mainstream large language models, including the DeepSeek-R1 distilled models, show that Truth Last improves MAD performance by up to 22% on reasoning tasks. MADC performs consistently well across these models, overcoming key performance bottlenecks of conventional MAD in multi-agent collaborative reasoning.
📝 Abstract
Recent studies on LLM agent scaling have highlighted the potential of Multi-Agent Debate (MAD) to enhance reasoning abilities. However, the critical aspect of role allocation strategies remains underexplored. In this study, we demonstrate that allocating roles with differing viewpoints to specific positions significantly impacts MAD's performance in reasoning tasks. Specifically, we identify a novel role allocation strategy, "Truth Last", which can improve MAD performance by up to 22% in reasoning tasks. To address the issue of unknown truth in practical applications, we propose the Multi-Agent Debate Consistency (MADC) strategy, which systematically simulates and optimizes its core mechanisms. MADC uses path consistency to assess agreement among independent roles and treats the role with the highest consistency score as the truth. We validated MADC on challenging reasoning tasks across nine LLMs, including the DeepSeek-R1 Distilled Models. MADC consistently demonstrated advanced performance, effectively overcoming MAD's performance bottlenecks and providing a crucial pathway for further improvements in LLM agent scaling.
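As a rough illustration of the idea described above, the following minimal Python sketch scores each role by how many other roles agree with its final answer, treats the highest-scoring role as the presumed truth, and moves that role to the end of the debate order. It is a simplification under stated assumptions, not the paper's implementation: the function names `consistency_scores` and `truth_last_order`, the use of exact answer matching as the agreement measure, and the first-wins tie-break are all hypothetical choices made for this example. The actual path-consistency measure in MADC may be defined differently.

```python
from collections import Counter


def consistency_scores(answers):
    """Return a hypothetical consistency score per role.

    `answers` maps a role name to the final answer of its independent
    reasoning path. A role's score is the fraction of the *other* roles
    whose answer matches exactly (an assumption made for this sketch).
    """
    counts = Counter(answers.values())
    n = len(answers)
    if n < 2:
        # With a single role there is nobody to agree with.
        return {role: 0.0 for role in answers}
    return {role: (counts[ans] - 1) / (n - 1) for role, ans in answers.items()}


def truth_last_order(answers):
    """Order roles so the presumed-truth role speaks last.

    The presumed truth is the role with the highest consistency score;
    ties go to the role that appears first in `answers`.
    """
    scores = consistency_scores(answers)
    presumed_truth = max(scores, key=scores.get)
    others = [role for role in answers if role != presumed_truth]
    return others + [presumed_truth]


if __name__ == "__main__":
    # Toy example: roles A and C agree, role B disagrees.
    answers = {"A": "42", "B": "17", "C": "42"}
    print(consistency_scores(answers))
    print(truth_last_order(answers))
```

In this toy run, roles A and C agree with each other and therefore outscore B; A wins the tie and is placed last. A real system would need a softer agreement measure than exact string equality, for example comparing normalized answers or intermediate reasoning steps.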