Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making

📅 2025-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multi-agent large language models (LLMs) in clinical decision-making often converge prematurely through "Silent Agreement," leading to overlooked differential diagnoses. Method: We propose the role-specialized Catfish Agent, which actively injects structured skepticism via two synergistic mechanisms: complexity-aware dissent triggering and tone-calibrated critical generation. The framework integrates medical knowledge guidance, dynamic complexity assessment, and controllable-tone dissent generation. Contribution/Results: Evaluated on nine medical question-answering and three medical visual question-answering benchmarks, the approach consistently outperforms state-of-the-art single- and multi-agent baselines, including GPT-4o and DeepSeek-R1, with statistically significant gains in diagnostic accuracy. It markedly improves the robustness and interpretability of clinical reasoning while mitigating groupthink.

📝 Abstract
Large language models (LLMs) have demonstrated strong potential in clinical question answering, with recent multi-agent frameworks further improving diagnostic accuracy via collaborative reasoning. However, we identify a recurring issue of Silent Agreement, where agents prematurely converge on diagnoses without sufficient critical analysis, particularly in complex or ambiguous cases. We present a new concept called the Catfish Agent, a role-specialized LLM designed to inject structured dissent and counter silent agreement. Inspired by the "catfish effect" in organizational psychology, the Catfish Agent is designed to challenge emerging consensus to stimulate deeper reasoning. We formulate two mechanisms to encourage effective and context-aware interventions: (i) a complexity-aware intervention that modulates agent engagement based on case difficulty, and (ii) a tone-calibrated intervention articulated to balance critique and collaboration. Evaluations on nine medical Q&A and three medical VQA benchmarks show that our approach consistently outperforms both single- and multi-agent LLM frameworks, including leading commercial models such as GPT-4o and DeepSeek-R1.
Problem

Research questions and friction points this paper is trying to address.

Addressing premature consensus in multi-agent LLMs for clinical diagnosis
Introducing Catfish Agent to counteract silent agreement bias
Improving diagnostic accuracy via structured dissent in complex cases
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Catfish Agent for structured dissent
Uses complexity-aware intervention for case difficulty
Employs tone-calibrated intervention to balance critique
Yihan Wang
The Chinese University of Hong Kong
Qiao Yan
The Chinese University of Hong Kong
Computer Vision
Zhenghao Xing
The Chinese University of Hong Kong
Multimodal Learning, Computer Vision
Lihao Liu
Amazon
LLM-based Agent, Healthcare AI
Junjun He
Shanghai Jiao Tong University
Chi-Wing Fu
The Chinese University of Hong Kong
Xiaowei Hu
Shanghai Artificial Intelligence Laboratory
Pheng-Ann Heng
The Chinese University of Hong Kong