🤖 AI Summary
To address low collaboration efficiency and poor output consistency when multiple LLM agents reason collaboratively, this paper proposes MARL-Focal, a focal-diversity-driven Multi-Agent Reinforcement Learning (MARL) framework. Methodologically, it introduces the first dynamic agent subset selection algorithm grounded in focal diversity; designs a reward-aware, policy-adaptive conflict resolution and fusion mechanism that integrates the agents' reasoning outputs consistently; and establishes a verifiable conflict detection and robust consensus-output paradigm tailored to LLM-based multi-agent systems. Evaluated on five benchmarks, the framework improves performance by 5.51% over the best individual LLM agent. On TruthfulQA, it improves factual consistency and adversarial robustness while remaining cost-efficient at inference.
📝 Abstract
Advances in Large Language Models (LLMs) and their fine-tuning strategies have renewed interest in multi-agent reinforcement learning. In this paper, we introduce a focal-diversity-optimized multi-agent reinforcement learning approach, coined MARL-Focal, with three unique characteristics. First, we develop an agent-fusion framework that encourages multiple LLM-based agents to collaborate in producing the final inference output for each LLM query. Second, we develop a focal-diversity-optimized agent selection algorithm that chooses a small subset of the available agents based on how well they complement one another in generating the query output. Third, we design a conflict-resolution method that detects output inconsistency among multiple agents and produces the MARL-Focal output through reward-aware and policy-adaptive inference fusion. Extensive evaluations on five benchmarks show that MARL-Focal is cost-efficient and adversarially robust. Our multi-agent fusion model achieves a performance improvement of 5.51% over the best individual LLM agent and offers stronger robustness on the TruthfulQA benchmark. Code is available at https://github.com/sftekin/rl-focal
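The abstract does not define the focal-diversity score or the learned fusion policy, so the sketch below only illustrates the general shape of the pipeline it describes: choose a small, complementary subset of agents, then fuse their answers. It makes two loud assumptions. Complementarity is approximated by coverage (the fraction of validation queries that at least one selected agent answers correctly), and fusion is a fixed weighted vote rather than a reward-aware, policy-adaptive RL policy. All names (`coverage`, `select_agents`, `fuse`, the agent labels, and the toy data) are hypothetical and are not taken from the MARL-Focal code.

```python
from collections import defaultdict
from itertools import combinations


def coverage(correct, subset):
    """Fraction of validation queries that at least one agent in `subset` answers correctly.

    Assumption: a stand-in complementarity score, not the paper's focal-diversity metric.
    """
    n_queries = len(correct[subset[0]])
    hits = sum(any(correct[a][q] for a in subset) for q in range(n_queries))
    return hits / n_queries


def select_agents(correct, k):
    """Exhaustively search all size-k subsets and return the one with the highest coverage."""
    agents = sorted(correct)
    return max(combinations(agents, k), key=lambda s: coverage(correct, s))


def fuse(answers, weights):
    """Weighted vote: return the answer whose supporting agents have the largest total weight.

    Assumption: fixed weights replace the learned, reward-aware fusion policy.
    """
    totals = defaultdict(float)
    for agent, answer in answers.items():
        totals[answer] += weights[agent]
    return max(totals, key=totals.get)


# Toy validation record: 1 means the agent answered that query correctly.
correct = {
    "agent_a": [1, 1, 1, 0, 0, 0],
    "agent_b": [1, 1, 1, 1, 0, 0],
    "agent_c": [0, 0, 0, 0, 1, 1],
}

team = select_agents(correct, k=2)
# Weight each selected agent by its individual validation accuracy.
weights = {a: sum(correct[a]) / len(correct[a]) for a in team}
final = fuse({"agent_b": "Paris", "agent_c": "Lyon"}, weights)
```

In this toy data, `agent_a` and `agent_b` are the two most accurate agents individually, yet they fail on the same queries, so the pair that includes the weaker but complementary `agent_c` covers more of the validation set. That is the intuition behind selecting agents by how well they complement one another rather than by individual accuracy alone.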