CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays

📅 2026-02-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the tendency of large vision-language models (LVLMs) to produce plausible but unreliable chest X-ray diagnoses: their responses are often not faithfully grounded in diagnostic evidence and offer little visual evidence for verification, and the models need costly retraining to support new diagnostic tasks. To overcome these limitations, the authors propose CXReasonAgent, a diagnostic agent that integrates a large language model with clinically grounded diagnostic tools to perform multi-step reasoning grounded in image-derived diagnostic and visual evidence. They also introduce CXReasonDial, a multi-turn dialogue benchmark comprising 1,946 dialogues across 12 diagnostic tasks. On this benchmark, CXReasonAgent produces faithfully grounded responses and supports more reliable and verifiable diagnostic reasoning than LVLMs.

📝 Abstract

Chest X-ray plays a central role in thoracic diagnosis, and its interpretation inherently requires multi-step, evidence-grounded reasoning. However, large vision-language models (LVLMs) often generate plausible responses that are not faithfully grounded in diagnostic evidence and provide limited visual evidence for verification. They also require costly retraining to support new diagnostic tasks, which limits their reliability and adaptability in clinical settings. To address these limitations, we present CXReasonAgent, a diagnostic agent that integrates a large language model (LLM) with clinically grounded diagnostic tools to perform evidence-grounded diagnostic reasoning using image-derived diagnostic and visual evidence. To evaluate these capabilities, we introduce CXReasonDial, a multi-turn dialogue benchmark with 1,946 dialogues across 12 diagnostic tasks, and show that CXReasonAgent produces faithfully grounded responses, enabling more reliable and verifiable diagnostic reasoning than LVLMs. These findings highlight the importance of integrating clinically grounded diagnostic tools, particularly in safety-critical clinical settings.
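
The idea of answering only from tool-derived evidence can be illustrated with a minimal Python sketch. Nothing here comes from the paper's implementation: the names `Evidence`, `cardiothoracic_ratio_tool`, `TOOLS`, and `answer`, the overlay path, the pixel widths, and the 0.5 cutoff are all hypothetical, and a dictionary lookup stands in for the LLM's tool selection.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Evidence:
    """One piece of tool output that the final answer can cite."""
    tool: str
    finding: str
    value: float
    overlay: str  # hypothetical path to a visual overlay for verification


def cardiothoracic_ratio_tool(image: dict) -> Evidence:
    # Stub: a real tool would measure these widths from the image itself.
    # Here they are read from a hand-written dict (hypothetical data).
    ratio = image["heart_width_px"] / image["thorax_width_px"]
    return Evidence(
        tool="cardiothoracic_ratio",
        finding="cardiothoracic ratio",
        value=round(ratio, 3),
        overlay="overlays/ctr.png",
    )


# Hypothetical registry: supporting a new task means adding a tool here,
# not retraining a model.
TOOLS: Dict[str, Callable[[dict], Evidence]] = {
    "cardiomegaly": cardiothoracic_ratio_tool,
}


def answer(task: str, image: dict) -> dict:
    """Answer only from tool evidence; decline when no tool covers the task."""
    tool = TOOLS.get(task)  # stand-in for the LLM choosing a tool
    if tool is None:
        return {"task": task, "conclusion": "no tool available", "evidence": []}
    ev = tool(image)
    # The 0.5 cutoff is an assumption made for this illustration only.
    conclusion = "enlarged" if ev.value > 0.5 else "not enlarged"
    return {"task": task, "conclusion": conclusion, "evidence": [ev]}


result = answer("cardiomegaly", {"heart_width_px": 1200, "thorax_width_px": 2000})
print(result["conclusion"], result["evidence"][0].value)
```

Every conclusion in this sketch carries the measurement and overlay that produced it, which is one simple way to make a response checkable by a reader rather than merely plausible.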

Problem

Research questions and friction points this paper is trying to address.

chest X-ray

diagnostic reasoning

evidence grounding

large vision-language models

clinical reliability

Innovation

Methods, ideas, or system contributions that make the work stand out.

evidence-grounded reasoning

diagnostic agent

chest X-ray interpretation