CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays

📅 2026-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the tendency of existing large vision-language models to produce plausible yet unreliable diagnoses in chest X-ray interpretation—responses often lacking sufficient radiological evidence and exhibiting poor generalization to novel tasks. To overcome these limitations, the authors propose an evidence-driven diagnostic framework that requires no retraining, integrating large language models with a clinical toolchain to enable multi-step, verifiable, and interactive diagnostic reasoning grounded in visual evidence. As part of this work, they introduce CXReasonDial, the first benchmark dataset for multi-turn diagnostic dialogue on chest X-rays, comprising 12 distinct tasks and 1,946 annotated dialogues. Extensive evaluation on this benchmark demonstrates that the proposed approach significantly outperforms current models in both reliability and verifiability.

📝 Abstract
Chest X-ray plays a central role in thoracic diagnosis, and its interpretation inherently requires multi-step, evidence-grounded reasoning. However, large vision-language models (LVLMs) often generate plausible responses that are not faithfully grounded in diagnostic evidence and provide limited visual evidence for verification, while also requiring costly retraining to support new diagnostic tasks, limiting their reliability and adaptability in clinical settings. To address these limitations, we present CXReasonAgent, a diagnostic agent that integrates a large language model (LLM) with clinically grounded diagnostic tools to perform evidence-grounded diagnostic reasoning using image-derived diagnostic and visual evidence. To evaluate these capabilities, we introduce CXReasonDial, a multi-turn dialogue benchmark with 1,946 dialogues across 12 diagnostic tasks, and show that CXReasonAgent produces faithfully grounded responses, enabling more reliable and verifiable diagnostic reasoning than LVLMs. These findings highlight the importance of integrating clinically grounded diagnostic tools, particularly in safety-critical clinical settings.
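The abstract describes an LLM that orchestrates clinically grounded tools to gather image-derived evidence before answering. A minimal sketch of that kind of evidence-gathering agent loop is below; all names here (the tools, the stub planner, the canned outputs) are hypothetical illustrations, not the paper's released implementation, and the planner stub merely stands in for an LLM deciding which tool to call next.

```python
# Hypothetical sketch of an evidence-grounded diagnostic agent loop.
# Tool names, outputs, and the planner are illustrative stand-ins,
# not CXReasonAgent's actual components.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Evidence:
    tool: str
    result: str


# Hypothetical clinical tools; a real system would wrap imaging models.
def detect_findings(image_id: str) -> str:
    return "cardiomegaly (CTR 0.62); no pleural effusion"


def localize_finding(image_id: str) -> str:
    return "enlarged cardiac silhouette in the lower mediastinum"


TOOLS: Dict[str, Callable[[str], str]] = {
    "detect_findings": detect_findings,
    "localize_finding": localize_finding,
}


def stub_planner(question: str, evidence: List[Evidence]) -> str:
    """Stands in for the LLM: pick the next tool, or answer when done."""
    called = {e.tool for e in evidence}
    for name in TOOLS:
        if name not in called:
            return name          # request more diagnostic evidence
    return "ANSWER"              # enough evidence has been gathered


def run_agent(question: str, image_id: str, max_steps: int = 5) -> dict:
    evidence: List[Evidence] = []
    for _ in range(max_steps):
        action = stub_planner(question, evidence)
        if action == "ANSWER":
            break
        evidence.append(Evidence(action, TOOLS[action](image_id)))
    # The answer carries its evidence chain, so it can be verified.
    return {
        "answer": "Findings consistent with cardiomegaly.",
        "evidence": [(e.tool, e.result) for e in evidence],
    }
```

Calling `run_agent("Is the heart enlarged?", "cxr_001")` returns the answer together with the ordered list of tool calls and their outputs, which is the verifiability property the paper emphasizes: the response is grounded in explicit, inspectable evidence rather than free-form generation.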
Problem

Research questions and friction points this paper is trying to address.

chest X-ray
diagnostic reasoning
evidence grounding
large vision-language models
clinical reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

evidence-grounded reasoning
diagnostic agent
chest X-ray interpretation
vision-language models
clinical decision support
Hyungyung Lee
KAIST
Hangyul Yoon
KAIST
Edward Choi
KAIST
Machine Learning · Artificial Intelligence · Healthcare