MedCausalX: Adaptive Causal Reasoning with Self-Reflection for Trustworthy Medical Vision-Language Models

📅 2026-03-24
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the vulnerability of existing medical vision-language models to spurious correlations and their lack of explicit causal reasoning mechanisms, both of which undermine clinical reliability. To this end, we propose MedCausalX, the first end-to-end causal reasoning framework with self-reflection capabilities. Our approach introduces causal and verification tokens, constructs contrastive samples that distinguish causal from spurious associations, and integrates a two-stage adaptive reflection architecture with a trajectory-level causal correction objective to enhance interpretability and trustworthiness. Evaluated on the newly curated CRMed dataset, MedCausalX achieves state-of-the-art performance across multiple benchmarks, improving diagnostic consistency by 5.4 points, reducing hallucination by over 10 points, and attaining the highest spatial localization IoU.
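The summary above mentions a two-stage adaptive reflection mechanism driven by causal and verification tokens. A toy sketch of how such token-triggered reflection could work during decoding is shown below; the function names (`next_token`, `check`) and the exact token strings are assumptions for illustration, not the paper's actual implementation:

```python
CAUSAL, VERIFY = "<causal>", "<verify>"

def decode_with_reflection(next_token, prompt, check, max_steps=32):
    """Toy sketch of token-triggered reflection (all names hypothetical).

    next_token(seq) -- proposes the next token given the sequence so far.
    check(seq)      -- verification predicate over the reasoning chain.

    When the model emits <verify> and the check fails, a <causal> token
    is appended so the model re-derives the failing reasoning step.
    """
    seq = list(prompt)
    for _ in range(max_steps):
        tok = next_token(seq)
        seq.append(tok)
        if tok == VERIFY and not check(seq):
            seq.append(CAUSAL)  # trigger causal re-analysis of the chain
        if tok == "<eos>":
            break
    return seq
```

The key design point is that the model itself decides when to emit `<verify>`, so verification cost is only paid on steps the model flags as uncertain.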

📝 Abstract
Vision-Language Models (VLMs) have enabled interpretable medical diagnosis by integrating visual perception with linguistic reasoning. Yet existing medical chain-of-thought (CoT) models lack explicit mechanisms to represent and enforce causal reasoning, leaving them vulnerable to spurious correlations and limiting their clinical reliability. We pinpoint three core challenges in medical CoT reasoning: how to adaptively trigger causal correction, construct high-quality causal-spurious contrastive samples, and maintain causal consistency across reasoning trajectories. To address these challenges, we propose MedCausalX, an end-to-end framework that explicitly models causal reasoning chains in medical VLMs. We first introduce the CRMed dataset, which provides fine-grained anatomical annotations, structured causal reasoning chains, and counterfactual variants that guide the learning of causal relationships beyond superficial correlations. Building upon CRMed, MedCausalX employs a two-stage adaptive reflection architecture equipped with $\langle$causal$\rangle$ and $\langle$verify$\rangle$ tokens, enabling the model to autonomously determine when and how to perform causal analysis and verification. Finally, a trajectory-level causal correction objective optimized through error-attributed reinforcement learning refines the reasoning chain, allowing the model to distinguish genuine causal dependencies from shortcut associations. Extensive experiments on multiple benchmarks show that MedCausalX consistently outperforms state-of-the-art methods, improving diagnostic consistency by +5.4 points, reducing hallucination by over 10 points, and attaining top spatial grounding IoU, thereby setting a new standard for causally grounded medical reasoning.
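The abstract describes learning from causal-spurious contrastive samples. One common way to operationalize such a setup is a margin-based contrastive loss, where the causal variant of a reasoning chain must outscore its spurious counterpart; the sketch below is such a generic formulation, not the paper's actual objective, and all names are hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def causal_contrastive_loss(anchor, causal, spurious, margin=0.2):
    """Hinge loss (illustrative): the causal variant of a reasoning
    embedding should outscore the spurious one by at least `margin`.
    Inputs are embedding vectors from some (hypothetical) encoder."""
    return max(0.0, margin - cosine(anchor, causal) + cosine(anchor, spurious))
```

When the causal variant already dominates by the margin, the loss is zero, so training pressure concentrates on pairs the model still confuses.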
Problem

Research questions and friction points this paper is trying to address.

causal reasoning
medical vision-language models
spurious correlations
chain-of-thought
clinical reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal reasoning
self-reflection
medical vision-language models
counterfactual augmentation
reinforcement learning