Dialectic-Med: Mitigating Diagnostic Hallucinations via Counterfactual Adversarial Multi-Agent Debate

📅 2026-04-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two weaknesses of multimodal large language models in medical diagnosis: they are susceptible to confirmation bias, which often leads to visual hallucinations, and they lack effective error-correction mechanisms. To mitigate these issues, the authors propose the first three-agent framework built around a dialectical adversarial mechanism: a proponent formulates an initial diagnosis, an opponent challenges it using visually grounded counterfactual evidence, and a mediator resolves the conflict through a weighted consensus graph. Explicitly modeling the falsification process is intended to make the reasoning verifiable and rigorous. Integrating visual counterfactual retrieval, multimodal alignment, and fine-tuning, the method achieves state-of-the-art performance on MIMIC-CXR-VQA, VQA-RAD, and PathVQA, significantly reducing hallucination rates while improving both diagnostic accuracy and explanation fidelity.

📝 Abstract

Multimodal Large Language Models (MLLMs) in healthcare suffer from severe confirmation bias, often hallucinating visual details to support initial, potentially erroneous diagnostic hypotheses. Existing Chain-of-Thought (CoT) approaches lack intrinsic correction mechanisms, rendering them vulnerable to error propagation. To bridge this gap, we propose Dialectic-Med, a multi-agent framework that enforces diagnostic rigor through adversarial dialectics. Unlike static consensus models, Dialectic-Med orchestrates a dynamic interplay between three role-specialized agents: a proponent that formulates diagnostic hypotheses; an opponent equipped with a novel visual falsification module that actively retrieves contradictory visual evidence to challenge the proponent; and a mediator that resolves conflicts via a weighted consensus graph. By explicitly modeling the cognitive process of falsification, our framework guarantees that diagnostic reasoning is tightly grounded in verified visual regions. Empirical evaluations on MIMIC-CXR-VQA, VQA-RAD, and PathVQA demonstrate that Dialectic-Med not only achieves state-of-the-art performance but also fundamentally enhances the trustworthiness of the reasoning process. Beyond accuracy, our approach significantly enhances explanation faithfulness and decisively mitigates hallucinations, establishing a new standard over single-agent baselines.
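The proponent, opponent, and mediator interplay described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Evidence` record, the `debate` and `mediate` functions, the toy agents, and all weights are hypothetical, and a signed weighted tally stands in for the weighted consensus graph, whose construction the abstract does not specify. Real agents would be MLLM calls with visual retrieval; here they are plain functions so that the control flow is visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    region: str      # hypothetical label for an image region
    hypothesis: str  # diagnosis this evidence bears on
    stance: int      # +1 supports the hypothesis, -1 contradicts it
    weight: float    # hypothetical confidence in [0, 1]


def mediate(evidence):
    """Sum signed weights per hypothesis and pick the highest score.

    A simplified stand-in for the paper's weighted consensus graph.
    """
    scores = {}
    for e in evidence:
        scores[e.hypothesis] = scores.get(e.hypothesis, 0.0) + e.stance * e.weight
    best = max(scores, key=scores.get)
    return best, scores


def debate(proponent, opponent, rounds=2):
    """Alternate proponent claims and opponent challenges, then mediate."""
    evidence = []
    for _ in range(rounds):
        claims = proponent(evidence)        # proponent sees the debate so far
        evidence.extend(claims)
        evidence.extend(opponent(claims))   # opponent tries to falsify the claims
    return mediate(evidence)


def toy_proponent(history):
    # Assumed behavior: propose pneumonia first, then revise once challenged.
    if not history:
        return [Evidence("left lower lobe", "pneumonia", +1, 0.6)]
    return [Evidence("right costophrenic angle", "effusion", +1, 0.7)]


def toy_opponent(claims):
    # Assumed behavior: contradictory evidence exists only for pneumonia.
    return [
        Evidence(c.region, c.hypothesis, -1, 0.8)
        for c in claims
        if c.hypothesis == "pneumonia"
    ]


best, scores = debate(toy_proponent, toy_opponent)
print(best, scores)
```

In this toy run the opponent's counter-evidence outweighs the proponent's initial support, so the mediator settles on the revised hypothesis. The point of the sketch is only that a challenged claim must survive explicit contradiction before it is accepted.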

Problem

Research questions and friction points this paper is trying to address.

diagnostic hallucinations

confirmation bias

multimodal large language models

error propagation

visual falsification

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial multi-agent debate

visual falsification

diagnostic hallucination mitigation